Abstract

We consider first passage percolation on sparse random graphs with prescribed degree 
distributions and general independent and identically distributed edge weights assumed to 
have a density. Assuming that the degree distribution satisfies a uniform $X^2\log X$-condition,
we analyze the asymptotic distribution for the minimal weight path between a pair of typical
vertices, as well as the number of edges on this path or hopcount.

The hopcount satisfies a central limit theorem where the norming constants are expressible 
in terms of the parameters of an associated continuous-time branching process. Centered 
by a multiple of $\log n$, where the constant is the inverse of the Malthusian rate of growth of
the associated branching process, the minimal weight converges in distribution. The limiting 
random variable equals the sum of the logarithms of the martingale limits of the branching 
processes that measure the relative growth of neighborhoods about the two vertices, and a 
Gumbel random variable, and thus shows a remarkably universal behavior. The proofs rely on 
a refined coupling between the shortest path problems on these graphs and continuous-time 
branching processes, and on a Poisson point process limit for the potential closing edges of 
shortest-weight paths between the source and destination. 

The results extend to a host of related random graph models, including random $r$-regular
graphs, inhomogeneous random graphs and uniform random graphs with a prescribed
degree sequence.
1 Introduction and results



1.1 Motivation 

First passage percolation is an important topic in modern probability, due to the inherent appli- 
cations in a number of fields such as disordered systems in statistical physics, and since it arises 
as a building block in the analysis of many more complicated interacting particle systems such 
as the contact process, other epidemic models and the voter model. 
Let us start by describing the basic model. Let $\mathcal{G}$ be a finite simply connected graph on $n$ vertices
(for example the box $[-N,N]^d$ in the $\mathbb{Z}^d$ lattice, so that $n = (2N+1)^d$). We assign a random
edge weight or length independently and identically distributed (i.i.d.) to each of the edges. Due
to the random weights, this is an example of a disordered system entrusted with carrying flow
between vertices in the graph. Fix two vertices in $\mathcal{G}$. Two functionals of interest are the minimal
weight $L_n$ of a path between the two vertices and the number of edges $H_n$ on the minimal path,
often referred to as the hopcount. We assume that the common distribution of the edge weights 
is continuous, so that the optimal paths are unique and one can talk about objects such as the 
number of edges on the optimal path. 

This model has been intensively studied, largely in the context of the integer lattice $[-N,N]^d$ (see
e.g. [42, 28, 32, 54]). For the power of this model to analyze more complicated interacting particle
systems, see [45] and [24] and the references therein. Due to the interest in complex networks, 
such as social networks or the Internet, recently, this model has attracted attention on general 
random graph models. Indeed, stimulated by the availability of an enormous amount of empirical 
data on real-world networks, the last decade has witnessed the formulation and development of 
many new mathematical graph models for real-world networks. These models are used to study
various dynamics, such as models of epidemics or random walks to search through the network 
(see e.g. [1, 51]). 

In the modern context, first passage percolation problems take on an added significance. Many 
real-world networks (such as the Internet at the router level or various road and rail networks) 
are entrusted with carrying flow between various parts of the network. These networks have both 
a graph theoretic structure as well as weights on edges, representing for example congestion. In 
the applied setting understanding properties of both the hopcount and the optimal weight are 
crucial, since whilst routing is done via least weight paths, the actual time delay experienced 
by users scales like the hopcount (the number of "hops" a message has to perform in getting 
from the source to the destination). Simulation-based studies (see e.g., [15]) suggest that random 
edge weights have a marked effect on the geometry of the network. This has been rigorously 
established in various works [5, 9, 10, 11], in the specific situation of exponential edge weights. 

In this paper, we study the behavior of the hopcount and minimal weight in the setting of random
graphs with finite variance degrees and general continuous edge weights. Since in many
applications, the distribution of edge weights is unknown, the assumption of general weights is 
highly relevant. From a mathematical point of view, working with general instead of exponential 
edge weights implies that our exploration process is non-Markovian. This is the first paper that 
studies first passage percolation on random graph models in this general setting. Further, due 
to the choices of degree distribution, our results immediately carry over to various other random 
graph models, such as rank-1 inhomogeneous random graphs as introduced in [14]. 

Organization of this section. We start by introducing the configuration model in Section 1.2, 
where we also state our main result Theorem 1.2. In Section 1.3, we discuss a continuous-time 
branching process approximation, which is necessary in order to be able to identify the limiting 
variables in Theorem 1.2. In Section 1.4, we extend our results to related random graph models, 
and in Section 1.5 we study some examples that allow us to relate our results to results in the 
literature. We close with Section 1.6 where we present a discussion of our results and some open 
problems. 

Throughout this paper, we make use of the following standard notation. We let $\xrightarrow{\text{a.s.}}$ denote
convergence almost surely, $\xrightarrow{L^1}$ denote convergence in mean, $\xrightarrow{d}$ denote convergence in
distribution, and $\xrightarrow{\mathbb{P}}$ convergence in probability. For a sequence of random variables $(X_n)_{n\ge 1}$, we
write $X_n = O_{\mathbb{P}}(b_n)$ when $|X_n|/b_n$ is a tight sequence of random variables, and $X_n = o_{\mathbb{P}}(b_n)$ when
$|X_n|/b_n \xrightarrow{\mathbb{P}} 0$ as $n\to\infty$. To denote that the random variable $D$ has distribution function $F$, we
write $D \sim F$. For non-negative functions $n \mapsto f(n)$, $n \mapsto g(n)$ we write $f(n) = O(g(n))$ when
$|f(n)|/g(n)$ is uniformly bounded, and $f(n) = o(g(n))$ when $\lim_{n\to\infty} f(n)/g(n) = 0$. Furthermore,
we write $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $g(n) = O(f(n))$. Finally, we write that a
sequence of events $(\mathcal{E}_n)_{n\ge 1}$ occurs with high probability (whp) when $\mathbb{P}(\mathcal{E}_n) \to 1$.

1.2 Configuration model and main result 

We work with the configuration model on n vertices [n] := {1, 2, . . . , n}. Let us first describe the 
random graph model. 

Configuration model. We are interested in constructing a random graph on n vertices with 
prescribed degrees. Given a degree sequence, namely a sequence of $n$ positive integers $\boldsymbol{d} =
(d_1, d_2, \ldots, d_n)$ with $\sum_{i\in[n]} d_i$ assumed to be even, the configuration model (CM) on $n$ vertices
with degree sequence $\boldsymbol{d}$ is constructed as follows:

Start with $n$ vertices and $d_i$ half-edges adjacent to vertex $i$. The graph is constructed by randomly
pairing each half-edge to some other half-edge to form edges. Let

$$ \ell_n = \sum_{i\in[n]} d_i \tag{1.1} $$

denote the total degree. Number the half-edges from 1 to $\ell_n$ in some arbitrary order. Then, at
each step, two half-edges which are not already paired are chosen uniformly at random among all
the unpaired or free half-edges and are paired to form a single edge in the graph. These
half-edges are no longer free and are removed from the list of free half-edges. We continue with this
procedure of choosing and pairing two half-edges until all the half-edges are paired. Observe that
the order in which we choose the half-edges does not matter. Although self-loops may occur,
these become rare as $n\to\infty$ (see e.g. [13] or [37] for more precise results in this direction). We
denote the resulting graph by $\mathrm{CM}_n(\boldsymbol{d})$, its vertex set by $[n]$ and its edge set by $\mathcal{E}_n$.
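As an illustration of the pairing procedure, the following minimal Python sketch builds the multigraph by shuffling the list of half-edges and reading it off in consecutive pairs, which gives a uniformly random pairing. The function name `configuration_model` and the 0-based vertex labels are our own illustrative choices and are not notation used elsewhere in this paper.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair half-edges uniformly at random and return the edge list.

    Vertices are labelled 0, ..., n-1 (an illustrative convention).
    Self-loops and multiple edges are kept, as in the multigraph CM_n(d).
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("total degree must be even")
    rng = random.Random(seed)
    # One list entry per half-edge, labelled by the vertex it is attached to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # A uniform permutation read off in consecutive pairs is a uniform
    # perfect matching of the half-edges.
    rng.shuffle(half_edges)
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]
```

For example, `configuration_model([3, 2, 2, 1], seed=1)` returns four edges whose endpoint counts reproduce the prescribed degrees, with a self-loop contributing two to the degree of its vertex.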



Regularity of vertex degrees. Above, we have described the construction of the CM when 
the degree sequence is given. Here, we shall specify how we construct the actual degree sequence 
$\boldsymbol{d}$. We start by formulating conditions on $\boldsymbol{d}$. We denote the degree of a uniformly chosen vertex
$V$ in $[n]$ by $D_n = d_V$. The random variable $D_n$ has distribution function given by

$$ F_n(x) = \frac{1}{n}\sum_{i\in[n]} 1_{\{d_i \le x\}}. \tag{1.2} $$

Write $\log(x)_+ = \log(x)$ for $x > 1$ and $\log(x)_+ = 0$ for $x \le 1$. We assume that the vertex degrees
satisfy the following regularity conditions: 



Condition 1.1 (Regularity conditions for vertex degrees)
(a) Weak convergence of vertex degree.

There exists a distribution function $F$ on $\mathbb{N}$ such that

$$ D_n \xrightarrow{d} D, \tag{1.3} $$

where $D_n$ and $D$ have distribution functions $F_n$ and $F$, respectively.
Equivalently, for any continuity point $x$ of $F$,

$$ \lim_{n\to\infty} F_n(x) = F(x). \tag{1.4} $$

(b) Convergence of second moment vertex degrees.

$$ \lim_{n\to\infty} \mathbb{E}[D_n^2] = \mathbb{E}[D^2], \tag{1.5} $$

where $D_n$ and $D$ have distribution functions $F_n$ and $F$, respectively, and we assume that

$$ \nu = \mathbb{E}[D(D-1)]/\mathbb{E}[D] > 1. \tag{1.6} $$

(c) Uniform $X^2\log X$ condition. For every $K_n \to \infty$,

$$ \limsup_{n\to\infty} \mathbb{E}[D_n^2 \log(D_n/K_n)_+] = 0. \tag{1.7} $$



The degree of a vertex chosen uniformly at random has distribution $D_n$ as given in Condition
1.1(a). By Condition 1.1(c), the degree distribution $D_n$ satisfies a uniform $X^2\log X$ condition. The
degree of a vertex incident to a half-edge that is chosen uniformly at random from all half-edges has the same
distribution as the random variable $D_n^\star$ given in (4.1), which is the size-biased version of $D_n$. The
latter random variable satisfies a uniform $X\log X$ condition if and only if $D_n$ satisfies a uniform
$X^2\log X$ condition. As explained in more detail in Section 1.3 below, $D_n^\star$ is closely related to a
branching-process approximation of neighborhoods of a uniform vertex, and thus Condition 1.1(c)
implies that this branching process satisfies a uniform $X\log X$ condition. By uniform integrability,
Condition 1.1(c) follows from the assumption that $\lim_{n\to\infty}\mathbb{E}[D_n^2\log(D_n)_+] = \mathbb{E}[D^2\log(D)_+]$.

Note that Condition 1.1(a) and (c) imply that $\mathbb{E}[D_n^i] \to \mathbb{E}[D^i]$, $i = 1,2$. When the degrees
are random themselves, we assume that the above convergence conditions hold in probability.
Condition (1.6) is equivalent to a giant component existing in $\mathrm{CM}_n(\boldsymbol{d})$, see e.g. [40, 48, 49]. We
often abbreviate $\mu = \mathbb{E}[D]$. Let $F$ be a distribution function of a random variable $D$ satisfying
$\mathbb{E}[D^2\log(D)_+] < \infty$. We give two canonical examples in which Condition 1.1 holds. The first
is when there are precisely $n_k = \lceil nF(k)\rceil - \lceil nF(k-1)\rceil$ vertices having degree $k$. The second is
when $(d_i)_{i\in[n]}$ is an i.i.d. sequence of random variables with distribution function $F$ (in the case
where $\sum_{i\in[n]} d_i$ is odd, we increase $d_n$ by 1; this does not affect the results).

As we will describe in more detail in Section 1.4, Condition 1.1 is such that it allows us to extend
our results to a range of other random graph models. 



Edge weights and shortest paths. Once the graph has been constructed, we attach an edge
weight $X_e$ to every edge $e$, where $(X_e)_{e\in\mathcal{E}_n}$ are i.i.d. continuous random variables with density
$g\colon [0,\infty) \to [0,\infty)$ and corresponding distribution function $G$. Pick two vertices $U_1$ and $U_2$ at
random and let $\Gamma_{12}$ denote the set of all paths in $\mathrm{CM}_n(\boldsymbol{d})$ between these two vertices. For any
path $\pi \in \Gamma_{12}$, the weight of the path is defined as

$$ c_{\mathrm{wk\text{-}dis}}(\pi) = \sum_{e\in\pi} X_e. \tag{1.8} $$

Let

$$ L_n = \min_{\pi\in\Gamma_{12}} c_{\mathrm{wk\text{-}dis}}(\pi) \tag{1.9} $$

denote the weight of the optimal (i.e., minimal weight) path between the two vertices and let
$H_n$ denote the number of edges or the hopcount of this path. If the two vertices are in different
components of the graph, then we let $L_n, H_n = \infty$. Now we are ready to state our main result.
Due to the complexity of the various constructs (constants and limiting random variables) arising 
in the theorem, we defer a complete description of these constructs to the next section. 
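For concreteness, the two functionals $L_n$ and $H_n$ can be computed on any finite weighted multigraph with Dijkstra's algorithm, tracking the number of edges along the current best path. The sketch below is only an illustration under our own conventions: the name `min_weight_and_hopcount`, the 0-based vertex labels and the `(u, v, weight)` edge format are assumptions, and ties between paths are ignored since the weights have a density.

```python
import heapq


def min_weight_and_hopcount(n, weighted_edges, source, target):
    """Return (L, H): the minimal path weight and the hopcount of that path.

    weighted_edges is an iterable of (u, v, weight) with u, v in range(n).
    Returns (inf, inf) when source and target are in different components.
    """
    adj = [[] for _ in range(n)]
    for u, v, w in weighted_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    inf = float("inf")
    dist = [inf] * n
    hops = [inf] * n
    dist[source], hops[source] = 0.0, 0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == target:
            break
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                hops[v] = hops[u] + 1  # one more edge than the path to u
                heapq.heappush(heap, (dist[v], v))
    return dist[target], hops[target]
```

Note that the minimal weight path need not be the path with the fewest edges, which is why the hopcount is recorded along the weight-optimal path rather than computed by a separate breadth-first search.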
Theorem 1.2 (Joint convergence hopcount and weight) Consider the configuration model
$\mathrm{CM}_n(\boldsymbol{d})$ with degrees $\boldsymbol{d}$ satisfying Condition 1.1, and with i.i.d. edge weights distributed according
to the continuous distribution $G$. Then, there exist constants $\alpha, \gamma, \beta \in (0,\infty)$ and $\alpha_n, \gamma_n$ with
$\alpha_n \to \alpha$, $\gamma_n \to \gamma$, such that the hopcount $H_n$ and weight $L_n$ of the optimal path between two
uniformly selected vertices conditioned on being connected, satisfy

$$ \Big(\frac{H_n - \gamma_n\log n}{\sqrt{\beta\log n}},\; L_n - \frac{1}{\alpha_n}\log n\Big) \xrightarrow{d} (Z, Q), \tag{1.10} $$

as $n\to\infty$, where $Z$ and $Q$ are independent and $Z$ has a standard normal distribution, while $Q$
has a continuous distribution.



In Remark 1.4 below, we will state conditions that imply that we can replace $\alpha_n$ and $\gamma_n$ by
their limits $\alpha$ and $\gamma$, respectively. Theorem 1.2 shows a remarkable kind of universality. For the
configuration model with finite-variance degrees satisfying Condition 1.1, the hopcount always
satisfies a central limit theorem with mean and variance proportional to $\log n$. Also, the weight of
the shortest weight path between two uniformly chosen vertices always is of order $\log n$, and the
fluctuations converge in distribution. We will see that even the limit distribution $Q$ of $L_n$ has a
large degree of universality. In order to do this, as well as to define the parameters $\alpha, \alpha_n, \gamma, \gamma_n$,
we first need to describe a continuous-time branching process approximation for the flow on the
configuration model with i.i.d. edge weights.



1.3 Continuous-time branching processes 

Before stating our results we recall a standard model of continuous-time branching process 
(CTBP), the splitting process or the Bellman-Harris process as well as various associated pro- 
cesses. 

Define the size-biased distribution $F^\star$ of the random variable $D$ with distribution function $F$ by

$$ F^\star(x) = \mathbb{E}[D\,1_{\{D\le x\}}]/\mathbb{E}[D], \qquad x\in\mathbb{R}. \tag{1.11} $$

Now let $(\mathrm{BP}^\star(t))_{t\ge 0}$ denote the following CTBP:

(a) At time $t = 0$, we start with one individual which we shall refer to as the original ancestor or
the root of the branching process. Generate $D^\star$ having the size-biased distribution $F^\star$ given in
(1.11). This individual immediately dies giving rise to $D^\star - 1$ children.

(b) Each new individual $v$ in the branching process lives for a random amount of time which has
distribution $G$, i.e., the edge weight distribution, and then dies. At the time of death again the
individual gives birth to $D_v^\star - 1$ children, where $D_v^\star \sim F^\star$. Lifetimes and number of offspring
across individuals are independent.
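Steps (a) and (b) translate directly into an event-driven simulation that keeps the death times of the alive individuals in a priority queue. The following is a minimal sketch, not an implementation used in this paper: the name `alive_at` and the two sampler arguments are hypothetical, and the lifetimes are assumed to be strictly positive so that the loop terminates.

```python
import heapq
import random


def alive_at(t, offspring, lifetime, seed=None):
    """Number of alive individuals at time t in the splitting process BP*.

    offspring(rng) samples the number of children D* - 1, and
    lifetime(rng) samples a strictly positive lifetime from G.
    The root dies at time 0 and is replaced by its children.
    """
    rng = random.Random(seed)
    # Min-heap holding the death times of the currently alive individuals.
    deaths = [lifetime(rng) for _ in range(offspring(rng))]
    heapq.heapify(deaths)
    while deaths and deaths[0] <= t:
        death_time = heapq.heappop(deaths)
        # The dying individual is replaced by its children, each of which
        # receives an independent lifetime.
        for _ in range(offspring(rng)):
            heapq.heappush(deaths, death_time + lifetime(rng))
    return len(deaths)
```

With two children per individual and unit lifetimes the population doubles at every integer time, so for instance `alive_at(2.5, lambda rng: 2, lambda rng: 1.0)` equals 8.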

Note that in the above construction, by Condition 1.1(b), if we let $X_v = D_v^\star - 1$ be the number
of children of an individual then the expected number of children satisfies

$$ \mathbb{E}[X_v] = \mathbb{E}[D_v^\star - 1] = \nu > 1. \tag{1.12} $$

Further, by Condition 1.1(c), for $D^\star \sim F^\star$,

$$ \mathbb{E}[D^\star\log(D^\star)_+] < \infty. \tag{1.13} $$



The CTBP defined above is a splitting process, with lifetime distribution $G$ and offspring
distribution $D^\star - 1$. Denote by $N(t)$ the number of offspring of an individual at time $t$. Then
$\mu(t) = \mathbb{E}[N(t)] = \nu\cdot G(t)$, where $\nu = \mathbb{E}[D^\star - 1]$. Furthermore, the Laplace-Stieltjes transform $\hat\mu$
is defined by

$$ \hat\mu(s) = \nu\int_0^\infty e^{-st}\,dG(t). $$

The Malthusian parameter $\alpha$ of the branching process $\mathrm{BP}^\star(\cdot)$ is the unique solution of the equation

$$ \hat\mu(\alpha) = \nu\int_0^\infty e^{-\alpha t}\,dG(t) = 1. \tag{1.14} $$

Since $\nu > 1$, we obtain that $\alpha\in(0,\infty)$. We also let $\alpha_n$ be the solution to (1.14) with $\nu$
replaced with $\nu_n = \mathbb{E}[D_n(D_n-1)]/\mathbb{E}[D_n]$. Clearly, $\alpha_n\to\alpha$ when Condition 1.1 holds, and
$|\alpha_n - \alpha| = O(|\nu_n - \nu|)$.
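Since $s \mapsto \nu\int_0^\infty e^{-st}\,dG(t)$ is continuous and strictly decreasing from $\nu > 1$ at $s = 0$ to $0$ as $s\to\infty$, equation (1.14) can be solved numerically by bisection whenever the Laplace transform of $G$ can be evaluated. The sketch below assumes the caller supplies that transform as a Python function; the names `malthusian` and `laplace` are illustrative only.

```python
def malthusian(nu, laplace, hi=1.0, tol=1e-12):
    """Solve nu * laplace(alpha) = 1 for alpha > 0 by bisection.

    laplace(s) must return E[exp(-s X)] for X with distribution G,
    and nu > 1 is required for a positive solution.
    """
    if nu <= 1:
        raise ValueError("need nu > 1 for a positive Malthusian parameter")
    lo = 0.0
    # Enlarge the bracket until nu * laplace(hi) <= 1.
    while nu * laplace(hi) > 1.0:
        hi *= 2.0
    # Invariant: nu * laplace(lo) > 1 >= nu * laplace(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if nu * laplace(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For mean-one exponential weights the transform is $1/(1+s)$, so `malthusian(3.0, lambda s: 1.0 / (1.0 + s))` should return a value close to $\nu - 1 = 2$.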

Standard theory (see e.g., [6, 33, 34]) implies that under our assumptions of the model, namely
(1.12) and (1.13), there exists a random variable $\mathcal{W}^\star$ such that

$$ e^{-\alpha t}|\mathrm{BP}^\star(t)| \xrightarrow{\text{a.s.},\,L^1} \mathcal{W}^\star. \tag{1.15} $$

Here the limiting random variable $\mathcal{W}^\star$ satisfies $\mathcal{W}^\star > 0$ a.s. on the event of non-extinction of the
branching process and is zero otherwise. Thus $\alpha$ measures the true rate of exponential growth of
the branching process.

Define the distribution function $\bar G$, which is often referred to as the stable-age distribution, by

$$ \bar G(x) = \nu\int_0^x e^{-\alpha y}\,dG(y), \tag{1.16} $$

where we recall that $\alpha$ is the Malthusian rate of growth parameter. Let $\bar\nu$ be the mean and $\bar\sigma^2$
the variance of $\bar G$. Then $\bar\nu, \bar\sigma^2 \in (0,\infty)$, since $\alpha > 0$. We also define $\bar G_n$ to be the distribution
function in (1.16) with $\nu$ and $\alpha$ replaced with $\nu_n$ and $\alpha_n$, and we let $\bar\nu_n$ and $\bar\sigma_n^2$ be its mean
and variance.

We need a small variation of the above branching process where the root of the branching process
dies immediately giving birth to a $D$ number of children where $D$ has distribution $F$. The
details for every other individual in this branching process remain unchanged from the original
description, namely each individual survives for a random amount of time with distribution $G$
giving rise to a $D^\star - 1$ number of children where $D^\star \sim F^\star$, the size-biased distribution function
$F^\star$. Writing $|\mathrm{BP}(t)|$ for the number of alive individuals at time $t$, it is easy to see here as well
that

$$ e^{-\alpha t}|\mathrm{BP}(t)| \xrightarrow{\text{a.s.},\,L^1} \widetilde{\mathcal{W}}. \tag{1.17} $$

Here, $\widetilde{\mathcal{W}}$ satisfies the stochastic equation

$$ \widetilde{\mathcal{W}} = \sum_{i=1}^{D} \mathcal{W}^{\star,(i)} e^{-\alpha\xi_i}, $$

where $D \sim F$, and $\mathcal{W}^{\star,(i)}$ are i.i.d. with the distribution of the limiting random variable in (1.15),
and $\xi_i$ are i.i.d. with distribution $G$. Let

$$ \mathcal{W} = \widetilde{\mathcal{W}}\,\big|\,\widetilde{\mathcal{W}} > 0. \tag{1.18} $$

To simplify notation in the sequel, we will use $(\mathrm{BP}(t))_{t\ge 0}$ to denote a CTBP with the root having
offspring either $D$ or $D^\star - 1$, which will be clear from the context.

We are now in a position to identify the limiting random variable $Q$ as well as the parameters
$\alpha, \alpha_n, \beta, \gamma_n$.

Theorem 1.3 (Identification of the limiting variables) The parameters $\alpha, \alpha_n, \beta, \gamma_n$ in Theorem
1.2 satisfy that $\alpha$ is the Malthusian rate of growth defined in (1.14) and $\alpha_n$ is the solution
to (1.14) with $\nu_n$ replacing $\nu$, while

$$ \gamma_n = \frac{1}{\alpha_n\bar\nu_n}, \qquad \beta = \frac{\bar\sigma^2}{\bar\nu^3\alpha}. \tag{1.19} $$

Further, $Q$ can be identified as:

$$ Q = \frac{1}{\alpha}\big(-\log\mathcal{W}^{(1)} - \log\mathcal{W}^{(2)} - \Lambda + c\big), \tag{1.20} $$

where $\mathbb{P}(\Lambda \le x) = e^{-e^{-x}}$, so that $\Lambda$ is a standard Gumbel random variable, $\mathcal{W}^{(1)}, \mathcal{W}^{(2)}$ are two
independent copies of the variable $\mathcal{W}$ in (1.18), also independent from $\Lambda$, and $c$ is the constant

$$ c = \log\big(\mu(\nu-1)^2/(\nu\alpha\bar\nu)\big). \tag{1.21} $$



Remark 1.4 (Asymptotic mean) We can replace $\alpha_n$ and $\gamma_n$ by their limits $\alpha$ and $\gamma = 1/(\alpha\bar\nu)$
in (1.10) precisely when $\gamma_n = \gamma + o(1/\sqrt{\log n})$ and $\alpha_n = \alpha + o(1/\log n)$. Since $|\alpha_n - \alpha| =
O(|\nu_n - \nu|)$, $|\bar\nu_n - \bar\nu| = O(|\nu_n - \nu|)$, these conditions are equivalent to $\nu_n = \nu + o(1/\sqrt{\log n})$ and
$\nu_n = \nu + o(1/\log n)$, respectively.



Theorem 1.3 implies that also the random variable Q is remarkably universal, in the sense that it 
always involves two martingale limit variables corresponding to the flow problem, and a Gumbel 
distribution. While such results were known for the exponential distribution (see e.g., [9]), this 
is the first time that FPP on random graphs with general edge weights is studied. 

Let $L_n(i)$ denote the weight of the $i$th shortest path, so that $L_n = L_n(1)$, and let $H_n(i)$ denote
its length. Further let $\bar H_n(i)$ and $\bar L_n(i)$ denote the re-centered and normalized quantities as in
Theorem 1.2. The same proof for the optimal path easily extends to prove asymptotic results for
the joint distribution of the weights and hopcount of these ranked paths. To keep the study to a 
manageable length, we shall skip a proof of this easy extension. 

Theorem 1.5 (Multiple paths) Under the conditions of Theorem 1.2, for every $m \ge 1$,

$$ \big((\bar H_n(i), \bar L_n(i))\big)_{i\in[m]} \xrightarrow{d} \big((Z_i, Q_i)\big)_{i\in[m]}, \tag{1.22} $$

as $n\to\infty$, where for $i \ge 1$, $Z_i$ and $Q_i$ are independent and $Z_i$ has a standard normal distribution,
while

$$ Q_i = \frac{1}{\alpha}\big(-\log\mathcal{W}^{(1)} - \log\mathcal{W}^{(2)} - \Lambda_i + c\big), \tag{1.23} $$

where $(\Lambda_i)_{i\ge 1}$ are the ordered points of an inhomogeneous Poisson point process with intensity
$\lambda(t) = e^t$.
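The ordered points of a Poisson point process on $\mathbb{R}$ with intensity $e^t$ are easy to sample: the mean measure of $(-\infty, t]$ equals $e^t$, so the points are the logarithms of the arrival times of a unit-rate Poisson process on $(0,\infty)$. In particular the first point satisfies $\mathbb{P}(\Lambda_1 > x) = e^{-e^{x}}$. The following minimal sketch uses this representation; the function name `first_points` is our own.

```python
import math
import random


def first_points(m, seed=None):
    """First m ordered points of a Poisson process on R with intensity e^t.

    Uses the time change t -> e^t: the images of the points form a
    unit-rate Poisson process on (0, inf), whose arrival times are
    partial sums of i.i.d. mean-one exponential variables.
    """
    rng = random.Random(seed)
    points, arrival = [], 0.0
    for _ in range(m):
        arrival += rng.expovariate(1.0)
        points.append(math.log(arrival))
    return points
```

The returned list is increasing by construction, since the arrival times of the unit-rate process are increasing and the logarithm is monotone.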



1.4 Related random graph models 

Uniform random graphs with a prescribed degree sequence. We call a graph simple 
when it contains no self-loops or multiple edges. It is well known that the CM conditioned on
being simple is a uniform random graph with the same degrees. As a result, our theorems extend 
to this setting: 
Theorem 1.6 (Extension to uniform random graphs with prescribed degrees) Under
the conditions of Theorem 1.2, the results in Theorem 1.2 and the identification of the limiting
variables in Theorem 1.3 apply to uniform random graphs with prescribed degree sequence $\boldsymbol{d}$
satisfying Condition 1.1.

The proof of Theorem 1.6 follows rather directly from that of Theorems 1.2-1.3, by conditioning 
on simplicity. By [13] or [37], under Condition 1.1, 

$$ \lim_{n\to\infty}\mathbb{P}(\mathrm{CM}_n(\boldsymbol{d})\text{ simple}) = e^{-\nu/2-\nu^2/4}. \tag{1.24} $$

The proof of (1.24) follows by a Poisson approximation on the number of self-loops and the
number of multiple edges, which are proved to converge to two independent Poisson random
variables with means $\nu/2$ and $\nu^2/4$ respectively. We can interpret the probability in (1.24) as the
probability that both these Poisson variables are equal to zero, which is equivalent to the graph
being simple. Now the proof of the main theorem reveals that in order to find the minimal weight
path between vertices $U_1, U_2$, we only need to investigate of order $\sqrt{n}$ edges. Therefore, the event
of simplicity of the configuration model will be mainly determined by the uninspected edges, and
is therefore asymptotically independent of $(H_n, L_n)$. This explains Theorem 1.6. We give a full
proof of Theorem 1.6 in Section 6. 



Rank-1 inhomogeneous random graphs. Fix a sequence of positive weights $(w_i)_{i\in[n]}$. We
shall assume that there exists a distribution function $F_W$ on $\mathbb{R}^+$ such that

$$ F_{n,w}(x) = \frac{1}{n}\sum_{i\in[n]} 1_{\{w_i\le x\}} \to F_W(x), \tag{1.25} $$

for each point of continuity of $F_W$. Here, $1_A$ denotes the indicator of the set $A$. Let $W_n$ denote
the weight of a uniformly chosen vertex in $[n]$, i.e., $W_n = w_V$, where $V \in [n]$ is chosen uniformly.
Let $W_n^\star$ denote the size-biased version of $W_n$, i.e.,

$$ \mathbb{P}(W_n^\star \le x) = \mathbb{E}[W_n 1_{\{W_n\le x\}}]/\mathbb{E}[W_n]. \tag{1.26} $$

Now given these weights, we construct a random graph by attaching an edge between vertex $i$
and $j$ with probability

$$ p_{ij} = 1 - e^{-w_iw_j/\ell_n}, \tag{1.27} $$

where, with some abuse of notation,

$$ \ell_n = \sum_{i\in[n]} w_i \tag{1.28} $$

is the sum of the vertex weights, and the statuses of different edges are independent. Let $\nu =
\mathbb{E}[W^2]/\mathbb{E}[W]$. We always assume $\nu > 1$ as this is necessary and sufficient for the existence of
a giant component (see, e.g., [14]). Note that letting $w_i = \lambda$, we immediately get the Erdős-Rényi
random graph with edge connection probability $1 - e^{-\lambda/n}$. Thus, this model is a natural
generalization of the classical random graph model. Related models are the generalized random
graph introduced by Britton, Deijfen and Martin-Löf in [16], for which

$$ p_{ij} = \frac{w_iw_j}{\ell_n + w_iw_j}, \tag{1.29} $$

and the random graph with given expected degrees or Chung-Lu model, where instead

$$ p_{ij} = \min\{w_iw_j/\ell_n, 1\}, \tag{1.30} $$
and which has been studied intensively by Chung and Lu (see [18, 19, 20, 21, 22]). Let $W$ denote a
random variable with distribution $F_W$. By Janson [38], when $W_n \xrightarrow{d} W$ and $\mathbb{E}[W_n^2] \to \mathbb{E}[W^2]$, the
three random graph models defined above are asymptotically equivalent, meaning that all events
have asymptotically equal probabilities. By [14], with $N_k(n)$ denoting the number of vertices
with degree $k$,

$$ N_k(n)/n \xrightarrow{\mathbb{P}} \mathbb{E}\Big[e^{-W}\frac{W^k}{k!}\Big]. \tag{1.31} $$

This proves that $D_n \xrightarrow{d} D$, where $D$ has the mixed Poisson distribution given in (1.31), i.e., for
this model

$$ F(x) = \sum_{k\le x}\mathbb{E}\Big[e^{-W}\frac{W^k}{k!}\Big]. \tag{1.32} $$



This is formulated in the following theorem: 



Theorem 1.7 (Extension to rank-1 inhomogeneous random graphs) For rank-1 inhomogeneous
random graphs with edge probabilities in (1.27), (1.29) or (1.30), where the weight of a
uniform vertex $W_n$ satisfies that $W_n \xrightarrow{d} W$, $\mathbb{E}[W_n] \to \mathbb{E}[W]$, $\mathbb{E}[W_n^2] \to \mathbb{E}[W^2]$ and, for every
$K_n \to \infty$,

$$ \mathbb{E}[W_n^2\log(W_n/K_n)_+] = o(1), $$

the results in Theorems 1.2 and 1.3 hold with limiting degree distribution $F$ in (1.32).



Theorem 1.7 can be understood as follows. By [16], the generalized random graph conditioned on 
its degree sequence is a uniform random graph with the same degree sequence. Therefore, Theorem 
1.7 follows from Theorem 1.6 when the conditions on the degrees in Condition 1.1 hold in probability.
By (1.31), $D_n \xrightarrow{d} D$, where $D$ has the mixed Poisson distribution given in (1.31). Therefore,
it suffices to show that $\mathbb{E}[W_n] \to \mathbb{E}[W]$, $\mathbb{E}[W_n^2] \to \mathbb{E}[W^2]$ and $\lim_{n\to\infty}\mathbb{E}[W_n^2\log(W_n/K_n)_+] = 0$
imply that the same convergence holds for the degree of a random vertex. This is proved in
Section 6. 
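To see informally why the three rank-1 models in (1.27), (1.29) and (1.30) are close in the sparse regime, one can compare the edge probabilities for a single pair of vertices as a function of $x = w_iw_j/\ell_n$. The sketch below does this; the function and dictionary key names are our own labels, and the third probability is capped at 1 so that it is a valid probability.

```python
import math


def edge_probabilities(wi, wj, total_weight):
    """Edge probabilities of the three rank-1 models for one vertex pair.

    total_weight plays the role of the sum of all vertex weights.
    """
    x = wi * wj / total_weight
    return {
        "exponential_form": 1.0 - math.exp(-x),   # as in (1.27)
        "generalized": x / (1.0 + x),             # as in (1.29)
        "chung_lu": min(x, 1.0),                  # as in (1.30)
    }
```

Since $x/(1+x) \le 1 - e^{-x} \le x$ for $x \ge 0$, the three values are always ordered, and they differ only by terms of order $x^2$ when $x$ is small, which is the typical situation when the weights are bounded and $\ell_n$ is of order $n$.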



1.5 Examples 

In this section, we discuss a few special examples of the edge weights that have arisen in a number 
of different contexts in the literature and have been treated via distribution specific techniques. 



Exponential weights. FPP on random graphs with exponential edge weights has received
substantial attention in the literature (see e.g., [5, 9, 10, 11]). Let $G(x) = 1 - e^{-x}$, for $x \ge 0$,
denote the distribution function of an exponential random variable with mean 1. This was one of
the first models to be formulated and analyzed in the context of the integer lattice, see [50] and
the complete analysis in [28]. For exponential weights, the Malthusian rate of growth parameter
$\alpha$ satisfies

$$ \nu\int_0^\infty e^{-\alpha x}\,dG(x) = \nu\int_0^\infty e^{-\alpha x}e^{-x}\,dx = 1, \tag{1.33} $$

so that $\alpha = \nu - 1$, $\alpha_n = \nu_n - 1$. Similarly, one can compute that $\bar G_n(x) = 1 - e^{-\nu_n x}$, $x \ge 0$, so
that

$$ \bar\nu_n = 1/\nu_n, \qquad \bar\sigma_n^2 = 1/\nu_n^2. \tag{1.34} $$

Using these values in Theorem 1.3 shows that $H_n$ is asymptotically normal, with asymptotic
mean and asymptotic variance both equal to $\frac{\nu}{\nu-1}\log n$. Finally $c = \log\big(\mu(\nu-1)^2/(\alpha\nu\bar\nu)\big) =
\log(\mu(\nu-1))$, which is equal to the constant in [9, (3.7)].¹ ² This result thus generalizes the
hopcount result in [9], where FPP with exponential weights is considered on the configuration
model with i.i.d. degrees with $D \ge 2$ a.s. In [9], also infinite variance degrees are studied, a case
that we do not investigate here. In fact, some of the results we prove here do not extend to this 
setting, see Section 1.6 for more details. 
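The constants of Theorem 1.3 can also be approximated numerically for a general edge-weight density, which gives a simple sanity check against the closed forms of the exponential case. The sketch below is an illustration under stated assumptions: integrals over $[0,\infty)$ are truncated at a finite `upper` limit and evaluated with Simpson's rule, which is only reasonable for bounded, light-tailed densities, and the names `simpson` and `fpp_constants` are hypothetical.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def fpp_constants(nu, density, upper=40.0):
    """Approximate (alpha, gamma, beta) for an edge-weight density g.

    alpha solves (1.14); gamma = 1/(alpha * mean) and
    beta = var / (mean**3 * alpha), where mean and var are those of the
    stable-age distribution (1.16). Integrals are truncated at `upper`.
    """
    def transform(s):
        return nu * simpson(lambda t: math.exp(-s * t) * density(t), 0.0, upper)

    lo, hi = 0.0, 1.0
    while transform(hi) > 1.0:
        hi *= 2.0
    for _ in range(50):  # bisection on the decreasing function `transform`
        mid = 0.5 * (lo + hi)
        if transform(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    m1 = nu * simpson(lambda t: t * math.exp(-alpha * t) * density(t), 0.0, upper)
    m2 = nu * simpson(lambda t: t * t * math.exp(-alpha * t) * density(t), 0.0, upper)
    var = m2 - m1 * m1
    return alpha, 1.0 / (alpha * m1), var / (m1 ** 3 * alpha)
```

For mean-one exponential weights and $\nu = 3$, calling `fpp_constants(3.0, lambda t: math.exp(-t))` should give values close to $\alpha = 2$ and $\gamma = \beta = \nu/(\nu-1) = 3/2$, in line with the computation above.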



Exponential weights plus a large constant. We next study what happens when $X_e =
1 + E_e/k$, where $(E_e)_{e\in\mathcal{E}_n}$ are i.i.d. exponentials with mean 1, and $k$ is a large constant. This setting
is, apart from a trivial time-rescaling, identical to the setting where $X_e = k + E_e$. In this case, one
would expect that for large $k$, $H_n$ is close to the graph distance between a pair of uniformly chosen
vertices in $[n]$, conditioned to be connected. This problem has attracted considerable attention,
see, in particular, [25] for the Norros-Reittu model and [30] for the CM. In these works, it has been
shown that $(H_n - \log_\nu(n))_{n\ge 1}$ is a tight sequence of random variables. This suggests (compare
with Theorems 1.2-1.3) that, as $k\to\infty$,

$$ \alpha \to \log\nu, \qquad \bar\nu \to 1, \qquad \frac{\bar\sigma^2}{\bar\nu^3\alpha} \to 0. \tag{1.35} $$

We now check this intuitive argument. Indeed,

$$ \nu\int_0^\infty e^{-\alpha x}\,dG(x) = \nu k\int_1^\infty e^{-\alpha x}e^{-k(x-1)}\,dx = \frac{\nu k}{\alpha+k}e^{-\alpha} = 1. \tag{1.36} $$

While solving this equation explicitly is hard, it is not too difficult to see that $k\to\infty$ implies
that $\alpha\to\log\nu$.

We can compute the stable-age distribution as $1 + \mathrm{Exp}(k+\alpha)$, so that $\bar\nu = 1 + 1/(k+\alpha)$, while
$\bar\sigma^2 = 1/(k+\alpha)^2 \to 0$. Therefore, $\bar\nu \sim 1$, which in turn also implies that $\alpha\bar\nu \to \log\nu$. Also,

$$ \frac{\bar\sigma^2}{\bar\nu^3\alpha} \sim k^{-2}(\log\nu)^{-1} \to 0. \tag{1.37} $$

This shows that the two settings of graph distances and FPP with weights $1 + \mathrm{Exp}(1)/k$ match
up nicely when $k\to\infty$.
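The convergence $\alpha\to\log\nu$ can be checked numerically by solving $\nu k e^{-\alpha}/(\alpha+k) = 1$ for increasing $k$. The left-hand side is decreasing in $\alpha$, equals $\nu > 1$ at $\alpha = 0$ and equals $k/(k+\log\nu) < 1$ at $\alpha = \log\nu$, so the root lies in $(0, \log\nu)$ and bisection applies. The function name below is our own.

```python
import math


def alpha_shifted_exponential(nu, k, tol=1e-12):
    """Solve nu * k * exp(-alpha) / (alpha + k) = 1 for weights 1 + Exp(1)/k."""
    def lhs(a):
        return nu * k * math.exp(-a) / (a + k)

    # lhs(0) = nu > 1 and lhs(log nu) = k / (k + log nu) < 1.
    lo, hi = 0.0, math.log(nu)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $\nu = 3$ the solutions for $k = 10, 10^3, 10^6$ increase towards $\log 3$, with a gap of order $(\log\nu)/k$.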



Weak disorder on random regular graph with large degree. As a third example we
consider the configuration model with fixed degrees $r$, and where each edge is given an edge
weight $E^s$, $s > 0$, where $E \sim \mathrm{Exp}(1)$. The parameter $s$ plays the role of inverse temperature
in statistical physics with $s\to\infty$ corresponding to the minimal spanning tree with exponential
edge weights. This setting has been studied on the complete graph in [8], and here we make the
connection to the results proved there.

In this case, $\nu = r - 1$. The Malthusian parameter $\alpha$ satisfies (compare (1.14)),

$$ (r-1)\,p\int_0^\infty e^{-\alpha x - x^p}x^{p-1}\,dx = 1, \tag{1.38} $$

where $p = 1/s$. We cannot solve (1.38) explicitly, but when $r\to\infty$, we conclude that $\alpha\to\infty$,
so that the above equation is close to

$$ rp\int_0^\infty e^{-\alpha x}x^{p-1}\,dx = rp\,\alpha^{-p}\Gamma(p) = r\alpha^{-p}\Gamma(p+1) = 1, \tag{1.39} $$
¹In [9, (3.7)], the Gumbel variable $\Lambda$, which appears as $\log M$ in [9, (C.19)], where $M$ is an exponential random
variable, should be replaced with $-\Lambda$.

²Also in [9, (5.4)] there is an error in the precise limiting random variable for FPP on the Erdős-Rényi random
graph due to the fact that [9, (4.16)] is not correct.
so that

$$ \alpha = (r\Gamma(1+1/s))^s. \tag{1.40} $$

The moments $\bar\nu$ and $\bar\sigma^2$ are then approximately given by

$$ \bar\nu \approx rp\int_0^\infty e^{-\alpha x}x^p\,dx = rp\,\alpha^{-(p+1)}\Gamma(p+1) = p/\alpha, \tag{1.41} $$

and

$$ \bar\sigma^2 \approx rp\int_0^\infty e^{-\alpha x}x^{p+1}\,dx - \bar\nu^2 = rp\,\alpha^{-(p+2)}\Gamma(p+2) - (p/\alpha)^2 = p/\alpha^2, \tag{1.42} $$

where we repeatedly use (1.40). As a result, we obtain

$$ \gamma = \frac{1}{\alpha\bar\nu} \approx 1/p = s, \qquad \beta = \frac{\bar\sigma^2}{\bar\nu^3\alpha} \approx 1/p^2 = s^2. \tag{1.43} $$

These results match up nicely with the result on the complete graph, obtained when $r = n - 1$,
for which [8] show that a central limit theorem holds for $H_n$ with asymptotic mean $s\log n$ and
asymptotic variance $s^2\log n$, while $n^s\big[L_n - \frac{1}{n^s\lambda}\log n\big]$ converges in distribution, where $\lambda = \Gamma(1 +
1/s)^s$.
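The algebra behind the large-$r$ approximations can be double-checked numerically by evaluating the Gamma-function expressions directly. The sketch below does so for given $r$ and $s$; it only verifies the approximate formulas against each other and does not solve the exact equation (1.38). The function name is hypothetical.

```python
import math


def weak_disorder_constants(r, s):
    """Evaluate the large-r approximations for weights E**s, E ~ Exp(1).

    Returns (alpha, gamma, beta) computed from the Gamma-function
    expressions for alpha, the stable-age mean and its variance.
    """
    p = 1.0 / s
    alpha = (r * math.gamma(1.0 + p)) ** s
    # Approximate mean of the stable-age distribution.
    mean = r * p * alpha ** (-(p + 1.0)) * math.gamma(p + 1.0)
    # Approximate second moment, then variance.
    second = r * p * alpha ** (-(p + 2.0)) * math.gamma(p + 2.0)
    var = second - mean ** 2
    gamma_const = 1.0 / (alpha * mean)
    beta_const = var / (mean ** 3 * alpha)
    return alpha, gamma_const, beta_const
```

For instance, with $r = 50$ and $s = 2$ the returned constants should agree with $\gamma = s = 2$ and $\beta = s^2 = 4$ up to floating-point error.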

1.6 Discussion 

In this section, we discuss our results, possible extensions and open problems. 

(a) Universality. As Theorems 1.2-1.3 show, even second order asymptotics for the hopcount
in the presence of disorder in the network depend only on the first two moments of the
size-biased offspring distribution and on the edge-weight distribution, but not on any other property
of the network model. Further, the limit distribution of the minimal weight between two uniform 
vertices conditioned on being connected has a universal shape, even though the martingale limit 
of the flow naturally strongly depends both on the graph topology as well as on the edge weight 
distribution. 

(b) Divergence from the mean-field setting. One famous model that has witnessed an 
enormous amount of interest in probabilistic combinatorial optimization is the mean-field model, 
where one starts with the complete graph on n vertices and then attaches random edge weights 
and analyzes optimal path structure. See [36] for a study on the effect of exponential edge weights 
on the geometry of the graph, [27] for study of the minimal spanning tree and [3] for a discussion 
of a number of other problems. Branching process methods have been used to good effect in 
this setting as well to analyze the effect of random disorder on the geometry of the graph, see 
[8] for example and often give good heuristics for what one would expect in the sparse random 
graph setting. However, what the main theorems imply is that in a number of situations, the 
mean-field setting diverges markedly from the random graph setting. For example, when each
edge has weight $E^{-s}$, where $E$ has an exponential distribution and $s > 0$, one can show that in
fact the hopcount between typical vertices converges to a constant [9], while Theorem 1.3 implies
that even in this case, for the CM, the hopcount scales as $\log n$ and satisfies a CLT.

(c) Infinite variance configuration model. In [10], we have also investigated the setting
where the degrees are i.i.d. with $\mathbb{E}[D^2] = \infty$ and with exponential edge weights. In this case,
the result for $L_n$ is markedly different, in the sense that $L_n$ converges in distribution without
re-centering. Further, $H_n$ satisfies a central limit theorem with asymptotic mean and variance equal to a
multiple of $\log n$. Now, when taking $X_e \sim 1 + \mathrm{Exp}(1)$, by [31], there are paths of length $\log\log n$
connecting vertices $U_1$ and $U_2$, conditionally on $U_1$ and $U_2$ being connected. Since the weight of
such a path is of the same order, we conclude that $H_n, L_n = \Theta_{\mathbb{P}}(\log\log n)$. Thus, in such cases it
is possible that $H_n$ acts on a different scale, even though $H_n \xrightarrow{\mathbb{P}} \infty$. It would be of interest to
investigate whether $H_n$ always satisfies a central limit theorem, and, if so, whether the order of
magnitude of its variance is always equal to that of its mean. 

(d) The X log X-condition. By Condition 1.1(c), we assume that the degrees satisfy a second 
moment condition with an additional logarithmic factor. This is equivalent to the CTBP satisfying 
an X log X condition (uniformly in the size n of the graph). It would be of interest to investigate 
what happens when this condition fails. It is well known that the branching process martingale
limit is identically equal to 0 when $E[X] < \infty$ but $E[X\log(X)_+] = \infty$ (see e.g., [6] or [33], or
[46]). Therefore, the limit in (1.20) does not exist. This suggests that (1.10) should be replaced
with Ln — ^log(t„), where tn is such that |BP*(t„)|n~^/^ has a non-trivial limit. 

(e) Flooding and diameter. In [5], the flooding time and diameter, i.e., $\max_{j\colon L_{U_1,j}<\infty} L_{U_1,j}$,
respectively $\max_{i,j\colon L_{i,j}<\infty} L_{i,j}$, where $L_{i,j}$ is the minimal weight between the vertices $i$ and $j$
and $U_1$ is, as before, a randomly selected vertex, are investigated in the context of the CM with
exponential edge weights. It would be of interest to investigate the flooding time for general edge
weights. We expect that the exponential distribution is special, since there the typical weight
has the same order of magnitude as the maximum over the vertices of the minimal edge weight
from that vertex. This fact is only true when the weight distribution has an exponential tail. For
example, taking $X_e = E^s$ for $s > 1$, the maximal minimal weight from a vertex is of order $(\log n)^s$,
which is much larger than the typical distance, which is $\Theta_P(\log n)$ due to our main results. It
would be interesting to investigate what the limit of the weight diameter is in this simple example.

(f) Superconcentration and chaos. In this study we have looked at some global functionals 
of the optimal path between randomly selected vertices and in particular have shown that the 
weight of the optimal path satisfies -L„/logn = Op(l). Analogous to various related problems 
in statistical physics such as random polymers, or FPP on the lattice, this suggests that the 
optimal path problem satisfies superconcentration. In particular, it suggests that this random 
combinatorial optimization problem is chaotic in the sense that there exists $\varepsilon_n \to 0$ such that
refreshing a fraction $\varepsilon_n$ of the edge weights with new random variables with the same distribution
would entirely change the actual optimal path, in the sense that the new optimal path would be 
"almost" disjoint of the original optimal path, see e.g. [17]. Such questions have also arisen in 
computer science wherein one is interested in judging the "importance" and fair price of various 
edges in the optimal path; if an edge being deleted causes a large change in the cost of the new 
optimal path, then that edge is deemed very valuable. These form the basis of various "truth 
and auction mechanisms" in computer science (see e.g. [47], [26], [7]). It would be interesting to 
derive rigorous results in our present context. 

(g) Pandemics, gossip and other models of diffusion: First passage percolation models 
as well as models using FPP as a building block have started to play an increasingly central 
role in the applied probability community in describing the flow of materials, ranging from viral
epidemics ([23]) and gossip algorithms ([4]) to more general finite Markov interchange processes
([2]). Models with more general edge distributions have also arisen in understanding the flow of
information and reconstruction of such information networks in sociology and computer science, 
see e.g. [43], [44] for just some examples in this vast field. 

2 Proof: construction of the flow clusters 

We start with some central constructions that lay the groundwork for the proofs of the main
results. We denote by $U_1$ and $U_2$ two randomly selected vertices, conditioned on being connected.
We think of the weights as edge lengths so that they induce a random metric on the graph $\mathrm{CM}_n(\boldsymbol{d})$.
For a half-edge $y$, we let $P_y$ denote the half-edge to which it is paired, i.e., $(y, P_y)$ forms an edge.
Further, we let $V_y$ be the vertex to which the half-edge $y$ is incident.






2.1 Flow clusters from $U_1$ and $U_2$



To understand the shortest path between these vertices, think of water percolating through the
network at rate one, started simultaneously from the two vertices. For any $t \ge 0$, the set of
vertices first seen by the flow from $U_i$ will often be referred to as the flow cluster or the shortest weight
graph of vertex $U_i$. When the two flows collide or create prospective collision edges, then these
generate prospective shortest paths.

Let us now give a precise mathematical formulation to the above description. We grow two flow
clusters (i.e. two stochastic processes in continuous time) from $U_1$ and $U_2$, simultaneously. The
main ingredients of the two flow clusters, namely the alive set $A(t)$, will only change at random
times $T_0 = 0 < T_1 < T_2 < \ldots$ and therefore the definition can be given recursively. At time
$t = T_0 = 0$, the vertices $U_1$ and $U_2$ die instantaneously, and give rise to $d_{U_1}$ and $d_{U_2}$ children.
These children correspond to half-edges incident to $U_1$ and $U_2$. We start by testing whether any
of the half-edges incident to $U_1$ are paired to one another. If so, then we remove both half-edges
from the total set of $d_{U_1}$ half-edges. We then define $X_0^{(1)}$ as the number of unpaired half-edges after
the self-loops incident to $U_1$ are removed. We next continue with the $d_{U_2}$ half-edges incident to
$U_2$, and check whether they are paired to one of the $X_0^{(1)}$ remaining half-edges incident to $U_1$ or
any of the $d_{U_2}$ half-edges incident to $U_2$. When such a half-edge is paired to one of the $d_{U_2}$ sibling
half-edges, a self-loop is formed. When such a half-edge is paired to one of the $X_0^{(1)}$ remaining
half-edges incident to vertex $U_1$, a so-called collision edge is formed. A collision possibly yields
the path with minimal weight between $U_1$ and $U_2$. We let $X_0^{(2)}$ denote the number of unpaired
half-edges after the tests for collision edges and cycles have been performed. Note that, by
construction, each of the $X_0^{(i)}$ half-edges incident to the vertices $U_i$, where $i \in \{1,2\}$, are paired
to new vertices, i.e., vertices distinct from $U_1$ and $U_2$.

For the moment we collect the collision edges at time $T_0$, together with the weights of the
connecting edge between $U_1$ and $U_2$, and continue with the description of the flow clusters. All
half-edges that are not paired to one of the other $d_{U_1} + d_{U_2} - 1$ half-edges incident to either $U_1$ or $U_2$
together form the set $A(0)$, the set of active half-edges at time 0. For $y \in A(0)$, we define $I(y) = i$
if the half-edge $y$ is connected to $U_i$, $i = 1,2$, and we define $(R_0(y))_{y\in A(0)}$ as an i.i.d. sequence of
life times having distribution function $G$.

We denote the set of alive half-edges at time $t$ by $A(t)$. For $y \in A(t)$, we record $I(y)$, which is the
index $i \in \{1,2\}$ to which $U_i$ the half-edge is connected, and we let $H(y)$ denote the height of $y$ to
$U_{I(y)}$. This height equals 0 for $y \in A(0)$. When we introduce new half-edges in $A(t)$ at later times
we will specify the height of these half-edges. Now define $T_1 = \min_{y\in A(0)} R_0(y)$ and denote by
$y_0^\star$ the half-edge equal to the argument of this minimum, hence $R_0(y_0^\star) = \min_{y\in A(0)} R_0(y)$. Since
life-times have a continuous distribution, $y_0^\star$ is a.s. unique. Now set $A(t) = A(0)$, $0 \le t < T_1$, i.e.,
the active set remains the same during the interval $[0,T_1)$, and define the flow cluster $\mathrm{SWG}(t)$,
for $0 \le t < T_1$, by

$$\mathrm{SWG}(t) = \{y, I(y), H(y), R_t(y)\}_{y\in A(t)}, \tag{2.1}$$

where $I(y)$ and $H(y)$ are defined above and $R_t(y) = R_0(y) - t$, $0 \le t \le T_1$, denotes the remaining
lifetime of half-edge $y$. This concludes the initial step in the recursion, where we defined $A(t)$ and
$\mathrm{SWG}(t)$ during the random interval $[T_0,T_1)$.

We continue using induction, by defining $A(t)$ and $\mathrm{SWG}(t)$ during the random interval $[T_k, T_{k+1})$,
given that the processes are defined on $[0,T_k)$. At time $t = T_k$, we remove $y^\star_{k-1}$ from the set
$A(t-)$. By construction, we know that $z_k \equiv P_{y^\star_{k-1}} \notin A(t-)$, so that $V_{z_k}$ is not a vertex that has
been reached by the flow at time $t$. Then, for each of the $d_{V_{z_k}} - 1$ other half-edges incident to
vertex $V_{z_k}$, we test whether it is part of a self-loop or paired to a half-edge from the set $A(t-)$.
All half-edges incident to $V_{z_k}$ that are part of a self-loop or paired to a half-edge in $A(t-)$ are removed from
vertex $V_{z_k}$; we also remove the involved half-edges from the set $A(t-)$. We will discuss the role
of the half-edges incident to $V_{z_k}$ that are paired to half-edges in $A(t-)$ in more detail below.

For all the remaining siblings of $z_k$ we do the following: Let $x$ be one such half-edge of $V_{z_k}$, then
$x$ is added to $A(T_k)$, $I(x) = I(y^\star_{k-1})$, $H(x) = H(y^\star_{k-1}) + 1$, while $R_{T_k}(x)$ is an i.i.d. life time with
distribution $G$. We now set $A(t) = A(T_k)$, $T_k \le t < T_{k+1}$, where $T_{k+1} = T_k + \min_{y\in A(T_k)} R_{T_k}(y)$,
and where the minimizing half-edge is called $y^\star_k$. Furthermore, for $t \in [T_k,T_{k+1})$, we can define
$\mathrm{SWG}(t)$ by (2.1), where $R_t(y) = R_{T_k}(y) - (t - T_k)$. Finally, we denote the number of the $d_{V_{z_k}} - 1$
other half-edges incident to vertex $V_{z_k}$ that do not form a self-loop and that are not paired to a
half-edge from the set $A(t-)$ by $X_k$. Later, it will also be convenient to introduce $X^\star_k = d_{V_{z_k}} - 1$.
Let $S_k = |A(T_k)|$, so that $S_0 = X_0^{(1)} + X_0^{(2)}$, while $S_k$ satisfies the recursion

$$S_k = S_{k-1} + X_k - 1. \tag{2.2}$$

This describes the evolution of $(\mathrm{SWG}(t))_{t\ge 0}$.



Cycle edges and collision edges. At the times $T_k$, $k \ge 1$, we find the half-edge $y^\star_{k-1}$ which
is paired to $z_k = P_{y^\star_{k-1}}$, and for each of the other half-edges $x$ incident to $V_{z_k}$, we check whether
or not $P_x \in A(T_k-)$. The half-edges paired to alive half-edges in $A(T_k-)$ are special. Indeed,
the edge $(x,P_x)$ creates a cycle when $I(x) = I(P_x)$ while $(x,P_x)$ completes a path between $U_1$
and $U_2$, when $I(x) = 3 - I(P_x)$. Precisely the latter edges can create the shortest-weight path
between $U_1, U_2$. Let us describe these collision edges in more detail.

At time $T_k$ and when we create a collision edge consisting of $x_k$ and $P_{x_k}$, then we record

$$\Big(\big(T_k, I(z_k), H(z_k), H(P_{x_k}), R_{T_k}(P_{x_k})\big)\Big)_{k\ge 0}. \tag{2.3}$$

It is possible that multiple half-edges incident to $V_{z_k}$ create collision edges, and if so, we collect
all of them in the list in (2.3). In this definition it is tempting to write $I(x_k)$ and $H(x_k)$, but
note that $x_k \notin A(T_k)$, whereas its sibling half-edge $z_k \in A(T_k-)$, and, moreover, $x_k$ and $z_k$ have
the same ancestor and the same height. With some abuse of notation we denote the $i$th collision
edge by $(x_i, P_{x_i})$; here $P_{x_i}$ is an alive half-edge and $x_i$ the half-edge which pairs to $P_{x_i}$; further
$z_i$ is the sibling of $x_i$ paired with the minimal edge $y^\star$ found by the flow. Let $T_i^{(\mathrm{col})}$ be the time
of creation of the $i$th collision edge. The weight of the (unique) path between $U_1$ and $U_2$ that
passes through the edge consisting of $x_i$ and $P_{x_i}$ equals $2T_i^{(\mathrm{col})} + R_{T_i^{(\mathrm{col})}}(P_{x_i})$, so that the shortest
weight equals:

$$L_n = \min_{i\ge 0}\big[2T_i^{(\mathrm{col})} + R_{T_i^{(\mathrm{col})}}(P_{x_i})\big]. \tag{2.4}$$

Let $I^\star$ denote the minimizer of $i \mapsto 2T_i^{(\mathrm{col})} + R_{T_i^{(\mathrm{col})}}(P_{x_i})$, then

$$H_n = H(z_{I^\star}) + H(P_{x_{I^\star}}) + 1. \tag{2.5}$$

Of course, (2.4) and (2.5) need a proof, which we give now.

Proof that $L_n$ given by (2.4) yields the minimal weight. Observe that each path between $U_1$ and
$U_2$ has a weight $L$ that can be written in the form $2T_i + R_{T_i}(P_{x_i})$ for some $i \ge 0$. Indeed, let
$(i_0 = U_1, i_1, i_2, \ldots, i_k = U_2)$ form a path with weight $L$, and denote the weight on $i_{j-1}i_j$ by $X_{e_j}$
for $1 \le j \le k$. For $k = 1$, we obviously find $L = 2T_0 + X_{e_1}$. For general $k \ge 1$, take the
maximal $j \ge 0$ such that $X_{e_1} + \cdots + X_{e_j} \le L/2$. Then, we write

$$L = \begin{cases} 2\sum_{s=1}^{j} X_{e_s} + \Big[\sum_{s=j+1}^{k} X_{e_s} - \sum_{s=1}^{j} X_{e_s}\Big], & \text{when } \sum_{s=1}^{j} X_{e_s} < \sum_{s=j+1}^{k} X_{e_s}, \\ 2\sum_{s=j+1}^{k} X_{e_s} + \Big[\sum_{s=1}^{j} X_{e_s} - \sum_{s=j+1}^{k} X_{e_s}\Big], & \text{when } \sum_{s=1}^{j} X_{e_s} > \sum_{s=j+1}^{k} X_{e_s}, \end{cases}$$

which in either case is of the form $L = 2T_m + R_{T_m}(y)$, for some $m \ge 0$ and some half-edge $y$.
Note that in the construction of the flow clusters, instead of putting weight on the edges, we have
given weights to half-edges. In the representation (2.4) the full edge weight is given to the
active half-edges and weight 0 to the ones with which they are paired. When the collision edge
has been found we give the full weight to the half-edge $P_x$. So, in fact, by the redistribution of
the weights, (2.4) is an equality in distribution. This completes the proof of the claim. ■

Remark 2.1 (On the number of collision edges) We do not have to find all collision edges.
The recursion can be stopped when $T_k \ge L/2$ for some $k \ge 1$, where $L$ is the weight of one of the
collision edges found previously. This is immediately clear, since all collision edges found at $T_k$
or later have weight exceeding $2T_k \ge L$.
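The construction of this section, including the stopping rule of Remark 2.1, can be illustrated on a fixed realization of the graph. The following is a minimal sketch, not the authors' procedure: it assumes that the pairing of half-edges and the edge weights are generated up front (rather than revealed during the exploration), that the weights are Exp(1), and all function names are hypothetical. On a fixed weighted graph each collision edge has weight $2T + R_T(P_x)$ exactly, so the minimum over collision edges can be cross-checked against a plain Dijkstra search.

```python
import heapq
import math
import random


def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random (self-loops and multi-edges allowed)."""
    vertex_of = [v for v, d in enumerate(degrees) for _ in range(d)]
    assert len(vertex_of) % 2 == 0, "total degree must be even"
    order = list(range(len(vertex_of)))
    rng.shuffle(order)
    pair = [0] * len(vertex_of)
    for a, b in zip(order[0::2], order[1::2]):
        pair[a], pair[b] = b, a
    return vertex_of, pair


def edge_weights(pair, rng):
    """One i.i.d. weight per edge, stored on both of its half-edges."""
    weight = [0.0] * len(pair)
    for h, p in enumerate(pair):
        if h < p:
            weight[h] = weight[p] = rng.expovariate(1.0)  # assumed law G = Exp(1)
    return weight


def two_source_flow(vertex_of, pair, weight, u1, u2):
    """Grow both flow clusters from u1 != u2; return (L_n, H_n) as in (2.4)-(2.5)."""
    incident = {}
    for h, v in enumerate(vertex_of):
        incident.setdefault(v, []).append(h)
    label, height = {u1: 1, u2: 2}, {u1: 0, u2: 0}  # I and H of each found vertex
    alive, heap = {}, []                             # alive half-edge -> death time
    best = (math.inf, None)                          # (weight, hopcount)

    def activate(v, t, arrival):
        nonlocal best
        for x in incident.get(v, []):
            p = pair[x]
            if x == arrival or vertex_of[p] == v:    # edge just used, or self-loop
                continue
            if p in alive:                           # cycle edge or collision edge
                u = vertex_of[p]
                if label[u] != label[v]:
                    remaining = alive[p] - t         # R_T(P_x)
                    best = min(best, (2 * t + remaining, height[v] + height[u] + 1))
                del alive[p]
            else:                                    # x becomes alive with full weight
                alive[x] = t + weight[x]
                heapq.heappush(heap, (alive[x], x))

    activate(u1, 0.0, None)
    activate(u2, 0.0, None)
    while heap:
        t, y = heapq.heappop(heap)
        if alive.get(y) != t:                        # y was removed earlier
            continue
        if 2 * t >= best[0]:                         # stopping rule of Remark 2.1
            break
        del alive[y]
        v, parent = vertex_of[pair[y]], vertex_of[y]
        label[v], height[v] = label[parent], height[parent] + 1
        activate(v, t, pair[y])
    return best


def dijkstra(vertex_of, pair, weight, source):
    """Plain single-source Dijkstra, used only as a cross-check."""
    adjacency = {}
    for h, v in enumerate(vertex_of):
        adjacency.setdefault(v, []).append((vertex_of[pair[h]], weight[h]))
    dist, heap = {source: 0.0}, [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for u, w in adjacency.get(v, []):
            if d + w < dist.get(u, math.inf):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


rng = random.Random(1)
vertex_of, pair = configuration_model([3] * 200, rng)
weight = edge_weights(pair, rng)
L_n, H_n = two_source_flow(vertex_of, pair, weight, 0, 1)
reference = dijkstra(vertex_of, pair, weight, 0).get(1, math.inf)
```

The merged heap plays the role of the alive set $A(t)$; stale heap entries correspond to half-edges that were removed when a cycle or collision edge was detected. If the two vertices are not connected, the sketch returns `(math.inf, None)`.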



2.2 Main result: Poisson Point Process limit 



In this section, we state our main result, which will imply Theorems 1.2- 1.3. 



Basic constructions and properties. To state our main technical result concerning the
appearance of collision edges, we need to define some new constructs. We start by defining a
rescaled version of the point process corresponding to the points in (2.3). Let us first set up some
notation. For $i \in \{1,2\}$ and $t \ge 0$, we let

$$|\mathrm{SWG}(t)| = \#\{y \in A(t)\}, \qquad |\mathrm{SWG}^{(i)}(t)| = \#\{y \in A(t)\colon I(y) = i\}, \tag{2.6}$$

be the number of alive half-edges at time $t$, as well as those that are closest to vertex $U_i$. By
construction, since we check whether the half-edges form a cycle or a collision edge when the
half-edges are born, $\mathrm{SWG}^{(1)}(t)$ and $\mathrm{SWG}^{(2)}(t)$ are disjoint. Consider the filtration $(\mathcal{F}_s)_{s\ge 0}$ with
$\mathcal{F}_s = \sigma((\mathrm{SWG}(t))_{t\in[0,s]})$ denoting the sigma-algebra generated by the shortest-weight graph up to
time $s$.

Fix a deterministic sequence $s_n \to \infty$ that will be chosen later on. Now let

$$t_n = \frac{1}{2\alpha_n}\log n, \qquad \bar t_n = \frac{1}{2\alpha_n}\log n - \frac{1}{2\alpha_n}\log\big(W^{(1)}_{s_n}W^{(2)}_{s_n}\big), \tag{2.7}$$

where, for $s \ge 0$,

$$W^{(i)}_s = e^{-\alpha_n s}|\mathrm{SWG}^{(i)}(s)|. \tag{2.8}$$

Note that $e^{\alpha_n t_n} = \sqrt{n}$, so that at time $t_n$, both $|\mathrm{SWG}^{(i)}(t_n)|$ are of order $\sqrt{n}$; consequently
the variable $t_n$ denotes the typical time at which collision edges start appearing, and the time
$\bar t_n$ incorporates the stochastic fluctuations in the size of the SWGs. The precise rate at which
$s_n \to \infty$ for asymptotic properties of the construction to hold is determined in the proof of
Proposition 2.4 below. In particular we choose $s_n \to \infty$ such that $\mathrm{SWG}^{(i)}(t)$ for $t \le s_n$ can
be coupled with two independent two-stage branching processes $\mathrm{BP}^{(i)}(t)$ such that with high
probability $\mathrm{BP}(t) = \mathrm{SWG}(t)$ for all $t \le s_n$.

Define the residual life-time distribution $F_R$ to have density $f_R$ given by

$$f_R(x) = \frac{\int_0^\infty e^{-\alpha y} g(x+y)\,dy}{\int_0^\infty e^{-\alpha y}[1-G(y)]\,dy}. \tag{2.9}$$

Recall that the $i$th collision edge is given by $(x_i, P_{x_i})$, where $P_{x_i}$ is an alive half-edge and $x_i$ the
half-edge which pairs to $P_{x_i}$. In terms of the above definitions, we define
and write the random variables $(\Xi_i)_{i\ge 1}$ with $\Xi_i \in \mathbb{R} \times \{1,2\} \times \mathbb{R} \times \mathbb{R} \times [0,\infty)$, by

$$\Xi_i = \big(\bar T_i^{(\mathrm{col})}, I(x_i), \bar H_i^{(\mathrm{or})}, \bar H_i^{(\mathrm{de})}, R_{T_i^{(\mathrm{col})}}(P_{x_i})\big). \tag{2.11}$$

Then, for sets $A$ in the Borel $\sigma$-algebra of the space $\mathcal{S} := \mathbb{R} \times \{1,2\} \times \mathbb{R} \times \mathbb{R} \times [0,\infty)$, we define
the point process

$$\Pi_n(A) = \sum_{i\ge 1} \delta_{\Xi_i}(A), \tag{2.12}$$

where $\delta_x$ gives measure 1 to the point $x$. Let $\mathcal{M}(\mathcal{S})$ denote the space of all simple locally
finite point processes on $\mathcal{S}$ equipped with the vague topology (see e.g. [41]). On this space one
can naturally define the notion of weak convergence of a sequence of random point processes
$\Pi_n \in \mathcal{M}(\mathcal{S})$. This is the notion of convergence referred to in the following theorem. In the
theorem, we let $\Phi$ denote the distribution function of a standard normal random variable.

Theorem 2.2 (PPP limit of collision edges) Consider the distribution of the point process
$\Pi_n \in \mathcal{M}(\mathcal{S})$ defined in (2.12) conditional on $(\mathrm{SWG}(s))_{s\in[0,s_n]}$ such that $W^{(1)}_{s_n} > 0$ and $W^{(2)}_{s_n} > 0$.
Then $\Pi_n$ converges in distribution as $n \to \infty$ to a Poisson Point Process (PPP) $\Pi$ with intensity
measure

$$\lambda(dt \times i \times dx \times dy \times dr) = \frac{2\nu f_R(0)}{\mu}\, e^{2\alpha t}\,dt \otimes \{1/2, 1/2\} \otimes \Phi(dx) \otimes \Phi(dy) \otimes F_R(dr). \tag{2.13}$$



Completion of the proof of Theorems 1.2, 1.3 and 1.5. Let us now prove Theorem 1.2 
subject to Theorem 2.2. First of all, by (2.10), (2.4) and (2.5) and Remark 2.1, 



H„ ^ log n 1 

in-— logn), (2.14) 



1 



i- log n '^''^ 

is a continuous function of the point process $\Pi_n$, and, therefore, by the continuous mapping
theorem, the above random variable converges in distribution to some limiting random variables
$(Z,Q)$.

Recall that $I^\star$ denotes the minimizer of $i \mapsto 2T_i^{(\mathrm{col})} + R_{T_i^{(\mathrm{col})}}(P_{x_i})$. By (2.4), the weight $L_n$ as well
as the value of $I^\star$, are functions of the first and the last coordinates of $\Pi_n$. The hopcount $H_n$ is
a function of the third and the fourth, instead. By the product form of the intensity in (2.13),
we obtain that the limits $(Z,Q)$ are independent. Therefore, it suffices to study their marginals.
The same observation applies to the multiple path problem in Theorem 1.5.

We start with the limiting distribution of the hopcount. By (2.10), 

Hn ^ log n 1 , , , 1 , 

- = -V2H'f:^ + - V2i7 f ' + Op(l). (2.15) 
5-2 1 2 2 ^ 

St- log n 

By Theorem 2.2, the random variables {HjZ''\HjT^), converge to two independent standard nor- 
mals, so that also the left-hand side of (2.15) converges to a standard normal. 

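The limiting distribution of the weight $L_n$ is slightly more involved. By (2.7), (2.4) and (2.10),

$$\begin{aligned} L_n - \frac{1}{\alpha_n}\log n = L_n - 2t_n &= L_n - 2\bar t_n - \frac{1}{\alpha_n}\log\big(W^{(1)}_{s_n}W^{(2)}_{s_n}\big) \\ &= -\frac{1}{\alpha_n}\log\big(W^{(1)}_{s_n}W^{(2)}_{s_n}\big) + \min_{i\ge 0}\big[2T_i^{(\mathrm{col})} + R_{T_i^{(\mathrm{col})}}(P_{x_i})\big] - 2\bar t_n \\ &= -\frac{1}{\alpha_n}\log\big(W^{(1)}_{s_n}W^{(2)}_{s_n}\big) + \min_{i\ge 0}\big[2\bar T_i^{(\mathrm{col})} + R_{T_i^{(\mathrm{col})}}(P_{x_i})\big]. \end{aligned} \tag{2.16}$$

By Proposition 2.4 below, $(W^{(1)}_{s_n}, W^{(2)}_{s_n}) \xrightarrow{d} (W^{(1)}, W^{(2)})$, which are two independent copies of the
random variable in (1.18). Hence,

$$L_n - \frac{1}{\alpha_n}\log n \xrightarrow{d} -\frac{1}{\alpha}\log\big(W^{(1)}W^{(2)}\big) + \min_{i\ge 1}\big[2P_i + R_i\big], \tag{2.17}$$

where $(P_i)_{i\ge 1}$ form a PPP with intensity $\frac{2\nu f_R(0)}{\mu}e^{2\alpha t}\,dt$, and $(R_i)_{i\ge 1}$ are i.i.d. random variables
with distribution function $F_R$ independently of $(P_i)_{i\ge 1}$.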

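We next identify the distribution of $M = \min_{i\ge 1}[2P_i + R_i]$. First, $(2P_i)_{i\ge 1}$ forms a Poisson
process with intensity $\frac{\nu f_R(0)}{\mu}e^{\alpha t}\,dt$. According to [52, Example 3.3 on page 137] the point process
$(2P_i + R_i)_{i\ge 1}$ is a non-homogeneous Poisson process whose mean measure is the convolution of
the mean measure of $(2P_i)_{i\ge 1}$, which assigns mass $\frac{\nu f_R(0)}{\alpha\mu}e^{\alpha x}$ to $(-\infty, x]$, and $F_R$. Hence
$P(M \ge x)$ equals the Poisson probability of 0, where the parameter of the Poisson distribution is
the value of this convolution at $x$, so that

$$P(M \ge x) = \exp\Big\{-\frac{\nu f_R(0)}{\mu}\, e^{\alpha x}\int_0^\infty F_R(z)e^{-\alpha z}\,dz\Big\}. \tag{2.18}$$

Let $\Lambda$ have a Gumbel distribution, i.e., $P(\Lambda \le x) = e^{-e^{-x}}$, $x \in \mathbb{R}$, then

$$P(-a\Lambda + b \ge x) = e^{-e^{x/a}e^{-b/a}}. \tag{2.19}$$

From the identity

$$\frac{\nu f_R(0)}{\mu}\, e^{\alpha x}\int_0^\infty F_R(z)e^{-\alpha z}\,dz = e^{x/a}e^{-b/a},$$

we conclude that if we take $a = 1/\alpha$ and $b = -\alpha^{-1}\log\big((\nu f_R(0)/\mu)\int_0^\infty F_R(z)e^{-\alpha z}\,dz\big)$, then

$$\min_{i\ge 1}(2P_i + R_i) \stackrel{d}{=} -\alpha^{-1}\Lambda - \alpha^{-1}\log\big(\nu f_R(0)B/\mu\big), \tag{2.20}$$

with $B = \int_0^\infty F_R(z)e^{-\alpha z}\,dz$. In the following lemma, we simplify these constants: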

Lemma 2.3 (The constant) The constants $B = \int_0^\infty F_R(z)e^{-\alpha z}\,dz$ and $f_R(0)$ are given by

$$B = \bar\nu/(\nu-1), \qquad f_R(0) = \alpha/(\nu-1). \tag{2.21}$$

Consequently, the constant $c$ in the limit variable (1.20) equals

$$c = -\log\big(\nu f_R(0)B/\mu\big) = \log\big(\mu(\nu-1)^2/(\alpha\nu\bar\nu)\big). \tag{2.22}$$
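The two identities in (2.21) can be sanity-checked numerically before reading the proof. The sketch below is only an illustration under stated assumptions: it takes Gamma(2,1) edge weights (an arbitrary choice, for which $\alpha = \sqrt{\nu} - 1$ solves $\nu\int_0^\infty e^{-\alpha y}\,dG(y) = 1$), uses $\bar\nu = \nu\int_0^\infty y e^{-\alpha y}\,dG(y)$ as it appears in (2.26), and evaluates all integrals with a plain midpoint rule. The function names are hypothetical.

```python
import math


def integrate(f, upper, steps):
    """Midpoint rule for the integral of f over [0, upper]."""
    h = upper / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


def lemma_2_3_constants(nu=4.0, upper=40.0, steps=800):
    """Numerically evaluate alpha, nu_bar, f_R(0) and B for Gamma(2,1) weights."""
    density = lambda y: y * math.exp(-y)           # g, assumed Gamma(2,1) density
    tail = lambda y: (1.0 + y) * math.exp(-y)      # 1 - G for the same law
    alpha = math.sqrt(nu) - 1.0                    # solves nu * (1 + alpha)**(-2) = 1
    disc = lambda y: math.exp(-alpha * y)
    nu_bar = nu * integrate(lambda y: y * disc(y) * density(y), upper, steps)
    denom = integrate(lambda y: disc(y) * tail(y), upper, steps)
    f_R0 = integrate(lambda y: disc(y) * density(y), upper, steps) / denom

    def F_R(z):
        # Integrating the density (2.9) over (z, infinity) gives the tail of F_R.
        return 1.0 - integrate(lambda y: disc(y) * tail(y + z), upper, steps) / denom

    B = integrate(lambda z: F_R(z) * disc(z), upper, steps)
    return {"nu": nu, "alpha": alpha, "nu_bar": nu_bar, "f_R0": f_R0, "B": B}


constants = lemma_2_3_constants()
```

For $\nu = 4$ one has $\alpha = 1$ and $\bar\nu = 1$, so both $f_R(0)$ and $B$ should be close to $1/3$; the quadrature error of the midpoint rule with these step sizes is of order $10^{-4}$.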

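Proof. We start by computing $f_R(0)$, for which we note that by (2.9) and (1.14),

$$f_R(0) = \frac{\int_0^\infty e^{-\alpha y} g(y)\,dy}{\int_0^\infty e^{-\alpha y}[1-G(y)]\,dy} = \Big(\nu\int_0^\infty e^{-\alpha y}[1-G(y)]\,dy\Big)^{-1}. \tag{2.23}$$

Further, by partial integration,

$$\int_0^\infty e^{-\alpha y}[1-G(y)]\,dy = \Big[-\frac{1}{\alpha}e^{-\alpha y}[1-G(y)]\Big]_{y=0}^{\infty} - \frac{1}{\alpha}\int_0^\infty e^{-\alpha y}g(y)\,dy = \frac{1}{\alpha} - \frac{1}{\alpha\nu} = \frac{\nu-1}{\alpha\nu}, \tag{2.24}$$

where we again use (1.14). Combining both equalities yields $f_R(0) = \alpha/(\nu-1)$.

For $B$, we again use partial integration, followed by the substitution of (2.9); this yields

$$B = \int_0^\infty F_R(z)e^{-\alpha z}\,dz = \frac{1}{\alpha}\int_0^\infty f_R(z)e^{-\alpha z}\,dz = \frac{\nu}{\nu-1}\int_0^\infty e^{-\alpha z}\int_0^\infty e^{-\alpha y}g(y+z)\,dy\,dz, \tag{2.25}$$

by (2.24). The final integral can be computed using

$$\int_{-\infty}^{\infty} e^{-\alpha z}\mathbb{1}_{\{z\ge 0\}}\int_{-\infty}^{\infty} e^{-\alpha y}g(y+z)\mathbb{1}_{\{y\ge 0\}}\,dy\,dz = \int_0^\infty s\,g(s)e^{-\alpha s}\,ds = \frac{1}{\nu}\int_0^\infty s\,\bar G(ds) = \bar\nu/\nu. \tag{2.26}$$

This completes the proof of both Theorem 1.2 and Theorem 1.3, given Theorem 2.2. ■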
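The Gumbel-type law (2.18) of $M = \min_i(2P_i + R_i)$ can also be illustrated by simulation. The sketch below rests on explicit assumptions: the edge weights are Exp(1), in which case the residual density (2.9) is again Exp(1) and $B = 1/(\alpha(1+\alpha))$; the constant `kappa` stands in for $2\nu f_R(0)/\mu$ and is set to an arbitrary value; the function names are hypothetical. The points $P_i$ are generated by mapping a unit-rate Poisson process through the inverse of $t \mapsto \kappa e^{2\alpha t}/(2\alpha)$.

```python
import math
import random


def sample_min(alpha, kappa, rng):
    """One sample of M = min_i (2 P_i + R_i), with (P_i) a PPP of intensity
    kappa * exp(2 * alpha * t) dt and R_i i.i.d. Exp(1) (assumed F_R)."""
    best, u = math.inf, 0.0
    while True:
        u += rng.expovariate(1.0)                        # unit-rate Poisson arrivals
        p = math.log(2.0 * alpha * u / kappa) / (2.0 * alpha)
        if 2.0 * p >= best:                              # later points cannot improve
            return best
        best = min(best, 2.0 * p + rng.expovariate(1.0))


def survival(x, alpha=1.0, kappa=2.0, samples=20000, seed=0):
    """Empirical and theoretical value of P(M >= x), the latter from (2.18)."""
    rng = random.Random(seed)
    hits = sum(sample_min(alpha, kappa, rng) >= x for _ in range(samples))
    B = 1.0 / (alpha * (1.0 + alpha))                    # closed form for Exp(1) weights
    theory = math.exp(-(kappa / 2.0) * B * math.exp(alpha * x))
    return hits / samples, theory


empirical, theoretical = survival(0.0)
```

Since the points $P_i$ are generated in increasing order and $R_i \ge 0$, the loop can stop as soon as $2P_i$ exceeds the current minimum, which mirrors the stopping rule of Remark 2.1.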
2.3 Overview of the proof of Theorem 2.2 

In this section, we reduce the proof of Theorem 2.2 to two main propositions. Recall the shortest
weight graph or flow cluster $\mathrm{SWG}(t)$ defined in the previous section as well as the associated
filtration $(\mathcal{F}_t)_{t\ge 0}$. We shall couple these flow clusters from two points with $(\mathrm{BP}(t))_{t\ge 0}$ where
$\mathrm{BP}(t) = (\mathrm{BP}^{(1)}(t), \mathrm{BP}^{(2)}(t))$ are two independent CTBPs starting with offspring distribution $D$.
For a prescribed such coupling of $(\mathrm{SWG}(t))_{t\ge 0}$ and $(\mathrm{BP}(t))_{t\ge 0}$, we let $\mathrm{SWG}(t)\triangle\mathrm{BP}(t)$ denote the
set of miscoupled half-edges at time $t$. Then we prove the following limiting result:

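Proposition 2.4 (Coupling the SWG to a BP) (a) There exists $s_n \to \infty$ and a coupling of
$(\mathrm{SWG}(s))_{s\ge 0}$ and $(\mathrm{BP}(s))_{s\ge 0}$ such that

$$P\big((\mathrm{SWG}(s))_{s\in[0,s_n]} = (\mathrm{BP}(s))_{s\in[0,s_n]}\big) = 1 - o(1). \tag{2.27}$$

Consequently, with $W^{(i)}_{s_n} = e^{-\alpha_n s_n}|\mathrm{SWG}^{(i)}(s_n)|$,

$$\liminf_{\varepsilon\downarrow 0}\liminf_{n\to\infty} P\big(W^{(1)}_{s_n} \in [\varepsilon,1/\varepsilon],\, W^{(2)}_{s_n} \in [\varepsilon,1/\varepsilon] \,\big|\, W^{(1)}_{s_n} > 0,\, W^{(2)}_{s_n} > 0\big) = 1. \tag{2.28}$$

(b) There exists a coupling of $(\mathrm{SWG}(s))_{s\ge 0}$ and $(\mathrm{BP}_{(n)}(s))_{s\ge 0}$, and sequences $\varepsilon_n = o(1)$ and
$B_n \to \infty$ such that, conditionally on $\mathcal{F}_{s_n}$,

$$P\big(|\mathrm{SWG}(\bar t_n + B_n)\triangle\mathrm{BP}_{(n)}(\bar t_n + B_n)| \ge \varepsilon_n\sqrt{n} \,\big|\, \mathcal{F}_{s_n}\big) \xrightarrow{P} 0, \tag{2.29}$$

where $(\mathrm{BP}_{(n)}(t))_{t\ge 0} = (\mathrm{BP}^{(1)}_{(n)}(t), \mathrm{BP}^{(2)}_{(n)}(t))_{t\ge 0}$ and $(\mathrm{BP}^{(1)}_{(n)}(s))_{s\ge s_n}$, $(\mathrm{BP}^{(2)}_{(n)}(s))_{s\ge s_n}$ are two independent
two-stage Bellman-Harris processes with offspring $D^\star_n - 1$ (where $D^\star_n$ has the size-biased
distribution $F^\star_n$ of $F_n$, see (1.11)) for every individual, and edge weights with continuous
distribution function $G$, and starting at time $s_n$ in $\mathrm{BP}(s_n)$ from part (a), respectively.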

The proof of Proposition 2.4 is deferred to Section 4. In the sequel, we shall assume that $P$
denotes the coupling measure from Proposition 2.4. In particular, this yields a coupling between
$\mathrm{CM}_n(\boldsymbol{d})$ for different $n \ge 1$, as well as a coupling between $\mathrm{CM}_n(\boldsymbol{d})$ and the $n$-dependent branching
processes $(\mathrm{BP}(s))_{s\ge 0}$. Under this coupling law, we can speak of convergence in probability, and
we shall frequently do this in the sequel.

For $i \in \{1,2\}$, $k \ge 0$, and $t \ge 0$, we define

$$|\mathrm{SWG}^{(i)}_k[t,t+s)| = \#\{y \in A(t)\colon I(y) = i,\ H(y) = k,\ R_t(y) \in [0,s)\}, \tag{2.30}$$

as the number of alive half-edges at time $t$ that (a) are in the SWG of vertex $U_i$, (b) have height
$k$, and (c) have remaining lifetime at most $s$. We further write

$$|\mathrm{SWG}^{(i)}_{\le k}[t,t+s)| = \#\{y \in A(t)\colon I(y) = i,\ H(y) \le k,\ R_t(y) \in [0,s)\}, \tag{2.31}$$

for the number of alive half-edges that have height at most $k$. To formulate the CLT for the height of
vertices, we will choose

kt{x) = ^ + x\ t^. (2.32) 
Finally, for a half-edge $y \in A(t)$, we let $X^\star_y = d_{V_{P_y}} - 1$.

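Proposition 2.5 (Ages and heights in SWG) Fix $j \in \{1,2\}$, $x,y,t \in \mathbb{R}$, $s_1,s_2 > 0$. Then,
conditionally on $W^{(1)}_{s_n}W^{(2)}_{s_n} > 0$,

(a)

$$e^{-2\alpha_n t_n}\,|\mathrm{SWG}^{(j)}_{\le k_{t_n}(x)}[\bar t_n + t, \bar t_n + t + s_1)|\;|\mathrm{SWG}^{(3-j)}_{\le k_{t_n}(y)}[\bar t_n + t, \bar t_n + t + s_2)| \xrightarrow{P} e^{2\alpha t}\Phi(x)\Phi(y)F_R(s_1)F_R(s_2), \tag{2.33}$$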

(b) 

j.e2"*$(x)<l.(y)F^(si)F«(52). 

The first assertion in the above proposition follows from [53, Theorem 1(b)] in the case that our 
CTBP has finite-variance offspring. The proof of Proposition 2.5 is deferred to Section 5. 

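Completion of the proof of Theorem 2.2. Recall that $\mathcal{F}_t = \sigma((\mathrm{SWG}(s))_{s\in[0,t]})$. We will
investigate the number of collision edges $(x_i, P_{x_i})$ with $I(x_i) = j \in \{1,2\}$, $H(x_i) \le k_{t_n}(x)$,
$H(P_{x_i}) \le k_{t_n}(y)$ and $R_{T_i^{(\mathrm{col})}}(P_{x_i}) \in [0,s)$ created in the time interval $[\bar t_n + t, \bar t_n + t + \varepsilon)$, where
$\varepsilon > 0$ is small. We let $\mathcal{X} = [a,b) \times \{j\} \times (-\infty,x] \times (-\infty,y] \times [0,s]$ be a subset of $\mathcal{S}$, and we prove
that

$$P\big(\Pi_n(\mathcal{X}) = 0 \mid \mathcal{F}_{s_n}\big) \xrightarrow{P} \exp\Big\{-\int_a^b \frac{\nu f_R(0)}{\mu}\, e^{2\alpha t}\Phi(x)\Phi(y)F_R(s)\,dt\Big\}. \tag{2.35}$$

By [41, Theorem 4.7], this proves the claim.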
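We split

$$\mathcal{X} = \bigcup_{\ell=1}^{N} \mathcal{X}^{(\varepsilon)}_\ell, \tag{2.36}$$

where $\mathcal{X}^{(\varepsilon)}_\ell = [t^{(\varepsilon)}_{\ell-1}, t^{(\varepsilon)}_\ell) \times \{j\} \times (-\infty,x] \times (-\infty,y] \times [0,s)$, with $t^{(\varepsilon)}_\ell = a + \ell\varepsilon$ and $\varepsilon = (b-a)/N$,
with $N \in \mathbb{N}$. We will let $\varepsilon \downarrow 0$ later on. For a fixed $\varepsilon > 0$, we say that a collision edge $(x_i, P_{x_i})$
is a first round collision edge when there exists $\ell \in [N]$ and a half-edge $y \in A(\bar t_n + t^{(\varepsilon)}_{\ell-1})$ such that
$y$ is found by the flow in the time interval $[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_\ell)$, $y$ is paired to the half-edge $P_y$ whose sibling
half-edge $x_i$ is paired to $P_{x_i} \in A(\bar t_n + t^{(\varepsilon)}_{\ell-1})$ with $I(y) = j \ne I(P_{x_i}) = 3 - j$. We call all other collision
edges second round collision edges. Denote the point processes of first and second round collision
edges by $\Pi_n^{(\mathrm{FR})}$ and $\Pi_n^{(\mathrm{SR})}$, so that $\Pi_n = \Pi_n^{(\mathrm{FR})} + \Pi_n^{(\mathrm{SR})}$. The next two lemmas investigate the point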
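processes $\Pi_n^{(\mathrm{FR})}$ and $\Pi_n^{(\mathrm{SR})}$:

Lemma 2.1 (PPP limit for the first round collision edges) For every $s \ge 0$, $x,y \in \mathbb{R}$,
$j \in \{1,2\}$, $\varepsilon > 0$ and $\ell \in [N]$, as $n \to \infty$,

$$P\big(\Pi_n^{(\mathrm{FR})}(\mathcal{X}^{(\varepsilon)}_\ell) = 0 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big) \xrightarrow{P} \exp\Big\{-\frac{\nu}{\mu}\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon)\Big\}. \tag{2.37}$$

Proof. The number of half-edges $z \in A(\bar t_n + t^{(\varepsilon)}_{\ell-1})$ that are found by the flow having $I(z) = j$
and $H(z) \le k_{t_n}(x)$ is equal to

$$|\mathrm{SWG}^{(j)}_{\le k_{t_n}(x)}[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_{\ell-1} + \varepsilon)|. \tag{2.38}$$

Fix such a half-edge $z$, and note that it is paired to $P_z$ that has $X^\star_z = d_{V_{P_z}} - 1$ sibling half-edges.
For each of these half-edges we test whether it is paired to a half-edge in $A(\bar t_n + t^{(\varepsilon)}_{\ell-1})$ or not.
Therefore, the total number of tests performed in the time interval $[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_\ell)$ is equal to

$$\sum_{z \in \mathrm{SWG}^{(j)}_{\le k_{t_n}(x)}[\bar t_n + t^{(\varepsilon)}_{\ell-1},\, \bar t_n + t^{(\varepsilon)}_\ell)} X^\star_z. \tag{2.39}$$

By construction, we test whether these half-edges are paired to half-edges that are incident to
the SWG or not. Each of these half-edges is paired to a half-edge $w \in A(\bar t_n + t^{(\varepsilon)}_{\ell-1})$ with $I(w) = 3 - j$
(and thus creating a collision edge) and $H(w) \le k_{t_n}(y)$ and $R_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}(w) \in [0,s)$ with probability
equal to

$$\frac{1}{\ell_n - o(n)}\,|\mathrm{SWG}^{(3-j)}_{\le k_{t_n}(y)}[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_{\ell-1} + s)|. \tag{2.40}$$

Therefore, the expected number of first round collision edges $(x_i, P_{x_i})$ with $I(x_i) = j \in \{1,2\}$,
$H(x_i) \le k_{t_n}(x)$, $H(P_{x_i}) \le k_{t_n}(y)$ and $R_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}(P_{x_i}) \in [0,s)$ created in the time interval
$[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_\ell)$ equals the product of the expressions in (2.39) and (2.40), and can be rewritten as

$$\frac{1}{\ell_n - o(n)}\,e^{2\alpha_n t_n}\Big(e^{-\alpha_n t_n}\sum_{z \in \mathrm{SWG}^{(j)}_{\le k_{t_n}(x)}[\bar t_n + t^{(\varepsilon)}_{\ell-1},\, \bar t_n + t^{(\varepsilon)}_\ell)} X^\star_z\Big)\Big(e^{-\alpha_n t_n}|\mathrm{SWG}^{(3-j)}_{\le k_{t_n}(y)}[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_{\ell-1} + s)|\Big). \tag{2.41}$$

By Proposition 2.5, conditionally on $\mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}$, and using that $(\ell_n - o(n))^{-1}e^{2\alpha_n t_n} \to \mu^{-1}$, we
find that (2.41), which represents the expected number of collision edges $x_i$ with $I(x_i) = j \in
\{1,2\}$, $H(x_i) \le k_{t_n}(x)$, $H(P_{x_i}) \le k_{t_n}(y)$ and $R_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}(P_{x_i}) \in [0,s)$ created in the time interval
$[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_\ell)$, converges in probability to:

$$\frac{\nu}{\mu}\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon). \tag{2.42}$$

Further, for $\varepsilon > 0$, conditionally on $\mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}$, the probability that none of the half-edges found in
the time interval $[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_\ell)$ creates a collision edge is asymptotically equal to

$$\prod_{v \in \mathrm{SWG}^{(j)}_{\le k_{t_n}(x)}[\bar t_n + t^{(\varepsilon)}_{\ell-1},\, \bar t_n + t^{(\varepsilon)}_\ell)}\Big(1 - \frac{1}{\ell_n - o(n)}|\mathrm{SWG}^{(3-j)}_{\le k_{t_n}(y)}[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_{\ell-1} + s)|\Big)^{X^\star_v} \xrightarrow{P} \exp\Big\{-\frac{\nu}{\mu}\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon)\Big\}. \tag{2.43}$$

Lemma 2.2 (A bound on the second round collision edges) For $x,y \in \mathbb{R}$, $j \in \{1,2\}$, $\varepsilon > 0$
and $\ell \in [N]$, as $n \to \infty$,

$$P\big(\Pi_n^{(\mathrm{SR})}(\mathcal{X}^{(\varepsilon)}_\ell) \ge 1 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big) = O_P(1)F_R(\varepsilon)G(\varepsilon). \tag{2.44}$$

Proof. By analogous arguments as above, the expected number of second round collision edges
is of order

$$O_P(1)\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon)G(\varepsilon), \tag{2.45}$$

since one of the half-edges $z$ that is found by the flow in the time interval $[\bar t_n + t^{(\varepsilon)}_{\ell-1}, \bar t_n + t^{(\varepsilon)}_\ell)$
needs to satisfy that one of the $d_{V_{P_z}} - 1$ half-edges has weight at most $\varepsilon$, and which, upon being
found, needs to create a collision edge. ■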



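Now we are ready to complete the proof of Theorem 2.2. We use that

$$P\big(\Pi_n(\mathcal{X}) = 0 \mid \mathcal{F}_{s_n}\big) = E\Big[\prod_{\ell=1}^{N} P\big(\Pi_n(\mathcal{X}^{(\varepsilon)}_\ell) = 0 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big) \,\Big|\, \mathcal{F}_{s_n}\Big]. \tag{2.46}$$

We start with the upper bound, for which we use that

$$P\big(\Pi_n(\mathcal{X}^{(\varepsilon)}_\ell) = 0 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big) \le P\big(\Pi_n^{(\mathrm{FR})}(\mathcal{X}^{(\varepsilon)}_\ell) = 0 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big) \xrightarrow{P} \exp\Big\{-\frac{\nu}{\mu}\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon)\Big\}, \tag{2.47}$$

by Lemma 2.1. We conclude that

$$\begin{aligned} P\big(\Pi_n(\mathcal{X}) = 0 \mid \mathcal{F}_{s_n}\big) &\le E\Big[\prod_{\ell=1}^{N}\exp\Big\{-\frac{\nu}{\mu}\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon)\Big\} \,\Big|\, \mathcal{F}_{s_n}\Big] \\ &= \exp\Big\{-\sum_{\ell=1}^{N}\frac{\nu}{\mu}\,e^{2\alpha t^{(\varepsilon)}_{\ell-1}}\Phi(x)\Phi(y)F_R(s)F_R(\varepsilon)\Big\} \\ &\to \exp\Big\{-f_R(0)\int_a^b \frac{\nu}{\mu}\,e^{2\alpha t}\Phi(x)\Phi(y)F_R(s)\,dt\Big\}, \end{aligned} \tag{2.48}$$

since $\lim_{\varepsilon\downarrow 0} F_R(\varepsilon)/\varepsilon = f_R(0)$, and the Riemann approximation

$$\varepsilon\sum_{\ell=1}^{N} e^{2\alpha t^{(\varepsilon)}_{\ell-1}} \to \int_a^b e^{2\alpha t}\,dt. \tag{2.49}$$

This proves the upper bound.

For the lower bound, we instead bound

$$P\big(\Pi_n(\mathcal{X}) = 0 \mid \mathcal{F}_{s_n}\big) \ge E\Big[\prod_{\ell=1}^{N} P\big(\Pi_n^{(\mathrm{FR})}(\mathcal{X}^{(\varepsilon)}_\ell) = 0 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big) \,\Big|\, \mathcal{F}_{s_n}\Big] - E\Big[\Big(\sum_{\ell=1}^{N} P\big(\Pi_n^{(\mathrm{SR})}(\mathcal{X}^{(\varepsilon)}_\ell) \ge 1 \mid \mathcal{F}_{\bar t_n + t^{(\varepsilon)}_{\ell-1}}\big)\Big) \wedge 1 \,\Big|\, \mathcal{F}_{s_n}\Big]. \tag{2.50}$$

The first term has already been dealt with, the second term is, by Lemma 2.2, bounded by

$$E\Big[\Big(O_P(1)\sum_{\ell=1}^{N} F_R(\varepsilon)G(\varepsilon)\Big) \wedge 1 \,\Big|\, \mathcal{F}_{s_n}\Big] \xrightarrow{P} 0, \tag{2.51}$$

as $\varepsilon \downarrow 0$, by dominated convergence, since $F_R(\varepsilon) = \varepsilon f_R(0)(1 + o(1))$ and $G(\varepsilon) = o(1)$.

3 Height CLT and stable age for CTBP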



In this section, we set the stage for the proof of Proposition 2.5 for CTBPs, by investigating
the first and second moment of particles of several types. We will make use of second moment
methods similar to the ones in [53], but with a suitable truncation argument to circumvent the
problem of infinite-variance offspring distributions.

We take $K \ge 1$ large and define, for an appropriate $\eta \in (0,1)$ that will be determined later on,

$$m_i = K\eta^{-i}, \tag{3.1}$$

and investigate the Bellman-Harris process where each individual in generation $i$ has offspring
distribution $(X \wedge m_i)$ instead of $X$, where $X$ denotes the offspring of our CTBP.

We denote the number of alive individuals in generation $k$ at time $t$ in the original branching
process by $|\mathrm{BP}_k(t)|$, and let $|\mathrm{BP}_k[t,t+s)|$ denote the number of alive individuals in generation
$k$ with residual lifetime at most $s$. We let $|\mathrm{BP}^{(\vec m)}_k[t,t+s)|$ denote the number of individuals in
generation $k$ at time $t$ and with remaining lifetime at most $s$ of the truncated branching process.
Define

$$|\mathrm{BP}^{(\vec m)}_{\le k}[t,t+s)| = \sum_{j=0}^{k} |\mathrm{BP}^{(\vec m)}_j[t,t+s)|, \qquad |\mathrm{BP}^{(\vec m)}[t,t+s)| = \sum_{j=0}^{\infty} |\mathrm{BP}^{(\vec m)}_j[t,t+s)|. \tag{3.2}$$

We also write $|\mathrm{BP}^{(\vec m)}(t)| = \lim_{s\to\infty} |\mathrm{BP}^{(\vec m)}[t,t+s)|$. A key ingredient to the proof of Proposition
2.5 is Proposition 3.1 below. In its statement, we also use $\eta = \nu\int_0^\infty e^{-2\alpha s}\,dG(s)$, so that $\eta < 1$
since $\alpha$ is such that $\nu\int_0^\infty e^{-\alpha s}\,dG(s) = 1$.

Proposition 3.1 (First and second moment CLT) Choose $m_i = K\eta^{-i}$ as in (3.1). Assume
that the $X\log X$ condition holds, i.e., $E[X\log(X)_+] < \infty$ where $X$ is the random amount of
offspring of our CTBP. Then with $A = (\nu-1)/(\alpha\nu\bar\nu)$,
(a) 

lim e-"*EnBP(t)n = A, lim e-"*E[|BP(t)| - |BP<*'(t)|l = 0, (3.3) 

t->-oo '- f— >-oo L J 

(b) there exists a $C > 0$ such that uniformly in $t \to \infty$,

$$e^{-2\alpha t}E\big[|\mathrm{BP}^{(\vec m)}(t)|^2\big] \le CK, \tag{3.4}$$

(c)

$$\lim_{t\to\infty} e^{-\alpha t}E\big[|\mathrm{BP}_{\le k_t(x)}[t,t+s)|\big] = A\Phi(x)F_R(s), \tag{3.5}$$

where $k_t(x)$ is defined in (2.32).

(d) The same results hold uniformly in $n$ when $t = t_n = \frac{1}{2\alpha_n}\log n$ as in (2.7), $\alpha$ is replaced by
$\alpha_n$, $K$ by $K_n$ and the branching process offspring distribution $X_n$ depends on $n$ in such a way
that $X_n \xrightarrow{d} X$, $E[X_n] \to E[X]$ and $\limsup_n E[X_n\log(X_n/K_n)_+] = 0$, for any $K_n \to \infty$.

Proof. We start by proving Proposition 3.1(a). The first claim is proved in [33, 35]. We bound
the first moment of the difference between the truncated and the original branching process. Let
$\nu$ be the expected offspring of the Bellman-Harris process, and let $\nu^{(i)} = E[X\mathbb{1}_{\{X\le m_i\}}]$, where
$m_i = K\eta^{-i}$. We compute that

$$e^{-\alpha t}E\Big[\sum_{k=0}^{\infty}\big[|\mathrm{BP}_k(t)| - |\mathrm{BP}^{(\vec m)}_k(t)|\big]\Big] = e^{-\alpha t}\sum_{k=0}^{\infty}\Big[\nu^k - \prod_{i=1}^{k}\nu^{(i)}\Big]\big[G^{*k}(t) - G^{*(k+1)}(t)\big], \tag{3.6}$$

where $G$ is the distribution function of the edge weights. In order to bound the differences
$\nu^k - \prod_{i=1}^{k}\nu^{(i)}$ we rely on the following lemma:

Lemma 3.2 (Effect of truncation on expectation CTBP) Fix $\eta \in (0,1)$ and $m_i = K\eta^{-i}$.
If $E[X\log(X)_+] < \infty$, then

$$1 - \prod_{i=1}^{\infty}\frac{\nu^{(i)}}{\nu} \le (\log(1/\eta))^{-1}E[X\log(X/K)_+] = o_K(1), \tag{3.7}$$

where $o_K(1)$ denotes a quantity that converges to zero as $K \to \infty$.

Proof. Since $\nu^{(i)}/\nu \le 1$ for all $i \ge 1$, it is easily shown by induction that

$$1 - \prod_{i=1}^{j}\frac{\nu^{(i)}}{\nu} \le \sum_{i=1}^{j}\Big(1 - \frac{\nu^{(i)}}{\nu}\Big). \tag{3.8}$$

Now, using that $\nu > 1$,

$$\sum_{i=1}^{\infty}\Big(1 - \frac{\nu^{(i)}}{\nu}\Big) \le \sum_{i=1}^{\infty}E\big[X\mathbb{1}_{\{X>m_i\}}\big] = E\Big[X\sum_{i=1}^{\infty}\mathbb{1}_{\{m_i<X\}}\Big], \tag{3.9}$$

and we note that the number of $i$ for which $m_i = K\eta^{-i} \le x$ is at most $[\log(x/K)/\log(1/\eta)] \vee 0$.
Therefore

$$1 - \prod_{i=1}^{\infty}\frac{\nu^{(i)}}{\nu} \le \sum_{i=1}^{\infty}\Big(1 - \frac{\nu^{(i)}}{\nu}\Big) \le (\log(1/\eta))^{-1}E[X\log(X/K)_+], \tag{3.10}$$

which converges to zero when $K \to \infty$. ■
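The chain of inequalities in (3.8)-(3.10) can be checked numerically for a concrete offspring law. The sketch below is purely illustrative: it assumes a power-law distribution on $\{1,\ldots,2000\}$ with exponent 3 (chosen so that $\nu > 1$), takes $\eta = 1/2$ as an arbitrary value in $(0,1)$, and uses hypothetical function names. Because the support is finite, the truncation is inactive once $m_i$ exceeds the largest atom, so the infinite product and sum reduce to finitely many terms.

```python
import math


def power_law_pmf(exponent=3.0, x_max=2000):
    """Assumed offspring law: P(X = k) proportional to k**(-exponent), 1 <= k <= x_max."""
    weights = {k: k ** (-exponent) for k in range(1, x_max + 1)}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


def truncation_bounds(pmf, K, eta):
    """Return the three quantities compared in (3.10), from left to right."""
    nu = sum(k * p for k, p in pmf.items())
    x_max = max(pmf)
    product, total, i = 1.0, 0.0, 1
    while K * eta ** (-i) < x_max:                 # beyond this, nu_i equals nu
        m_i = K * eta ** (-i)
        nu_i = sum(k * p for k, p in pmf.items() if k <= m_i)
        product *= nu_i / nu
        total += 1.0 - nu_i / nu
        i += 1
    log_moment = sum(k * p * math.log(k / K) for k, p in pmf.items() if k > K)
    return 1.0 - product, total, log_moment / math.log(1.0 / eta)


pmf = power_law_pmf()
results = {K: truncation_bounds(pmf, K, 0.5) for K in (1, 4, 16)}
```

Each entry of `results` holds the left-hand side, the intermediate sum and the right-hand side of (3.10); the right-hand side shrinks as $K$ grows, in line with the $o_K(1)$ statement of the lemma.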
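By Lemma 3.2 and (3.6),

$$e^{-\alpha t}E\Big[\sum_{k=0}^{\infty}\big[|\mathrm{BP}_k(t)| - |\mathrm{BP}^{(\vec m)}_k(t)|\big]\Big] = o_K(1)\,e^{-\alpha t}E\Big[\sum_{k=0}^{\infty}|\mathrm{BP}_k(t)|\Big] = o_K(1), \tag{3.11}$$

which completes the proof of Proposition 3.1(a).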

We continue with the proof of the second moment estimate in Proposition 3.1(b). We follow the
proof in [53], paying attention to the truncation. We introduce $h$ as the generating function of
$X$, and $h_j$ as the generating function of $(X \wedge m_j)$, i.e.,

$$h(s) = E[s^{X}], \qquad h_j(s) = E[s^{(X\wedge m_j)}], \tag{3.12}$$

where $m_j$ is given by (3.1). Parallel to calculations in the proof of [53, Lemma 4],

E[|BP''^'|2] = /i;'(l)(E[|BP(™i^|])2 * G + /i;(l)E[|BP('"^i>|2] * G, (3.13) 

where rhi = [m2,m^, . . .), is fh with the first element removed. Transforming to 

|BP*"'(t)| =e-"*|BP(*'(t)|, (3.14) 

we obtain, by multiplying both sides of (3.13) by e~^"*, 

E[|BP^"^'|2] = !?^(E[|BP'"'^'|])2 * Q + !?^E[|BP""^^f] * G, (3.15) 



where 



$$ \bar G(t) = \nu\int_0^t e^{-\alpha y}\,dG(y), \qquad \bar Q(t) = \eta^{-1}\int_0^t e^{-\alpha y}\,d\bar G(y) = \eta^{-1}\nu\int_0^t e^{-2\alpha y}\,dG(y), \tag{3.16} $$



and where we recall that $\eta = \nu\int_0^{\infty} e^{-2\alpha y}\,dG(y) < 1$ and $\nu = h'(1)$. Iteration of (3.15) yields

oo 

E[|BP"^'|2] = ^6i---&j_ia,E[|BP'™^'|]2*G,_i*Q, (3.17) 

i=i 

where 

a,- = — , 0,- = — , (3.18) 

■' u V 

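and where $\vec m_j = (m_{j+1}, m_{j+2}, \ldots)$. According to (3.11) the expectation $\mathbb{E}[|\overline{\mathrm{BP}}^{(\vec m_j)}(t_n)|]$ has the
same asymptotic behavior as $\mathbb{E}[|\overline{\mathrm{BP}}(t_n)|]$. Hence, by [53, Lemma 1(a)] or alternatively by part (a),

$$ \lim_{n\to\infty}\mathbb{E}[|\overline{\mathrm{BP}}^{(\vec m_j)}(t_n)|] = \lim_{n\to\infty}\mathbb{E}[|\overline{\mathrm{BP}}(t_n)|] = A. \tag{3.19} $$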

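Since $b_1 b_2 \cdots b_j \le \eta^{j}$ tends to zero exponentially, this leads to

$$ \lim_{n\to\infty}\mathbb{E}[|\overline{\mathrm{BP}}^{(\vec m)}(t_n)|^2] = A^2\sum_{j=1}^{\infty} b_1\cdots b_{j-1}a_j. \tag{3.20} $$

We bound the arising sum in the following lemma: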

Lemma 3.3 (Effect of truncation on variance CTBP) For $m_j = K\eta^{-j}$, and with $\nu = \mathbb{E}[X]$,

V6i---6,_iaj- < . 3.21 

i=i ' 

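Proof. We bound $b_j \le \eta$, and

$$ a_j \le \eta\,\mathbb{E}[(X\wedge m_j)^2] = \eta\big(m_j^2\,\mathbb{P}(X > m_j) + \mathbb{E}[X^2\mathbf{1}_{\{X\le m_j\}}]\big), \tag{3.22} $$

so that

$$ \sum_{j=1}^{\infty} b_1\cdots b_{j-1}a_j \le \sum_{j=1}^{\infty} m_j^2\,\mathbb{P}(X > m_j)\,\eta^{j} + \sum_{j=1}^{\infty}\mathbb{E}[X^2\mathbf{1}_{\{X\le m_j\}}]\,\eta^{j}. \tag{3.23} $$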

We bound both terms separately. The first contribution equals 

K'^nX > Kr')ri-' = K'ElY^r^-n^^^-^^x}] = ^ _ ~ ], (3-24) 

where $a(x) = \max\{j : K\eta^{-j} < x\} = \lfloor \log(x/K)/\log(1/\eta)\rfloor$. Therefore, $\eta^{-a(X)} \le X/K$, so that

^m|P(X > mj)7]^ < E[X/K] = — —. (3.25) 

j=i ~ ^ ~^ 



The second contribution is bounded in a similar way as 

i=i "j=i 



^' — 1 ^ — 1 / 



where $b(x) = \min\{i : K\eta^{-i} \ge x\} = \lceil \log(x/K)/\log(1/\eta)\rceil \ge \log(x/K)/\log(1/\eta)$, so that $\eta^{b(X)} \le K/X$. Therefore,

J2nx\x<m,}W < r. E[KX] = (3.27) 
as required. ■
Combining (3.20) with Lemma 3.3 yields: 

lim E[|BP<"\t„)l'] < 

n-j-oo [I — T]) 

so that (b) follows with C = .

For Proposition 3.1(c), we start by showing that, for the original branching process $(|\mathrm{BP}(t)|)_{t\ge 0}$,

$$ e^{-\alpha t}\sum_{j=0}^{k_t(x)}\mathbb{E}[|\mathrm{BP}_j[t,t+s)|] \to A\,\Phi(x)\,F_R(s). \tag{3.28} $$

Conditioning on the lifetime (with c.d.f. equal to $G$) of the first individual, after which the
individual dies and splits in a random number of offspring with mean $\nu$,

$$ \mathbb{E}[|\mathrm{BP}_j[t,t+s)|] = \nu\int_0^t \mathbb{E}[|\mathrm{BP}_{j-1}[t-y,t+s-y)|]\,dG(y). \tag{3.29} $$

As before,

$$ |\overline{\mathrm{BP}}_j[t,t+s)| = e^{-\alpha t}|\mathrm{BP}_j[t,t+s)|. \tag{3.30} $$

Rewriting (3.29) we obtain the recursion

$$ \mathbb{E}[|\overline{\mathrm{BP}}_j[t,t+s)|] = \int_0^t \mathbb{E}[|\overline{\mathrm{BP}}_{j-1}[t-y,t+s-y)|]\,d\bar G(y). \tag{3.31} $$

Hence, if we continue to iterate, we get

$$ \mathbb{E}[|\overline{\mathrm{BP}}_j[t,t+s)|] = \int_0^t \mathbb{E}[|\overline{\mathrm{BP}}_0[t-y,t+s-y)|]\,d\bar G^{*j}(y), \tag{3.32} $$

where $\bar G^{*j}$ is the $j$-fold convolution of $\bar G$, and hence the distribution function of the independent
sum of $j$ copies of a random variable each having c.d.f. $\bar G$. This is the point where we will use
the CLT. For fixed $s > 0$, we define

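$$ |\overline{\mathrm{BP}}_{>m}[t,t+s)| = \sum_{j=m+1}^{\infty}|\overline{\mathrm{BP}}_j[t,t+s)|. \tag{3.33} $$

Observe that $|\mathrm{BP}[t,t+s)| = \sum_{j=0}^{\infty}|\mathrm{BP}_j[t,t+s)|$ is the total number of alive individuals of residual
lifetime at most $s$, so that by [29, Theorem 24.1] (since $G$ admits a density and $1 < \nu < \infty$, the
conditions of this theorem are fulfilled),

$$ \lim_{t\to\infty}\mathbb{E}[|\overline{\mathrm{BP}}[t,t+s)|] = \lim_{t\to\infty}\sum_{j=0}^{\infty}\mathbb{E}[|\overline{\mathrm{BP}}_j[t,t+s)|] = A\,F_R(s), $$

where

$$ A = \frac{\nu - 1}{\alpha\nu^2\int_0^{\infty} y e^{-\alpha y}\,dG(y)} = \frac{\nu-1}{\alpha\nu\bar\nu}. \tag{3.34} $$

Hence, (3.28) follows if we show that

$$ \mathbb{E}[|\overline{\mathrm{BP}}_{>k_t(x)}[t,t+s)|] \to A F_R(s) - A F_R(s)\Phi(x) = A F_R(s)\Phi(-x). \tag{3.35} $$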
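Note that

$$ \mathbb{E}[|\overline{\mathrm{BP}}_{>k_t(x)}[t,t+s)|] = \int_0^t \mathbb{E}[|\overline{\mathrm{BP}}[t-u,t-u+s)|]\,d\bar G^{*k_t(x)}(u). \tag{3.36} $$

Take an arbitrary $\varepsilon > 0$ and take $t_0$ so large that for $t > t_0$,

$$ \big|\mathbb{E}[|\overline{\mathrm{BP}}[t,t+s)|] - A F_R(s)\big| \le \varepsilon. \tag{3.37} $$

Then,

$$ \big|\mathbb{E}[|\overline{\mathrm{BP}}_{>k_t(x)}[t,t+s)|] - A F_R(s)\Phi(-x)\big| \le \varepsilon\,\bar G^{*k_t(x)}(t) + A F_R(s)\,\big|\bar G^{*k_t(x)}(t) - \Phi(-x)\big| + \int_{t-t_0}^{t}\mathbb{E}[|\overline{\mathrm{BP}}[t-u,t-u+s)|]\,d\bar G^{*k_t(x)}(u). \tag{3.38} $$

The last term vanishes since $\mathbb{E}[|\overline{\mathrm{BP}}[t,t+s)|]$ is uniformly bounded and $\bar G^{*k_t(x)}(t) - \bar G^{*k_t(x)}(t-t_0) = o(1)$. Furthermore, with $m = k_t(x) \to \infty$,

$$ k_t(x) = \frac{t}{\bar\nu} + x\sqrt{t\,\frac{\bar\sigma^2}{\bar\nu^3}}, \qquad\text{so that}\qquad t = m\bar\nu - x\bar\sigma\sqrt{m}\,(1+o(1)). \tag{3.39} $$

As a result, by the CLT and the fact that $\bar\nu$ and $\bar\sigma^2$ are the mean and the variance of the
distribution function $\bar G$,

$$ \lim_{t\to\infty}\bar G^{*k_t(x)}(t) = \Phi(-x). \tag{3.40} $$

Together with (3.38), this proves the claim in (3.35), and hence Proposition 3.1(c).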
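The CLT step for the convolution can be illustrated in a special case. The sketch below assumes exponential edge weights with rate 1 and mean offspring $\nu$ (an assumption made purely for illustration). In that case $\alpha = \nu - 1$, the tilted law $\bar G$ is exponential with rate $\nu$, so $\bar\nu = 1/\nu$, $\bar\sigma^2 = 1/\nu^2$, the centering becomes $k_t(x) = \nu t + x\sqrt{\nu t}$, and $\bar G^{*m}(t) = \mathbb{P}(\mathrm{Poisson}(\nu t) \ge m)$. The function names are hypothetical.

```python
import math


def poisson_tail(lam, m):
    """P(N >= m) for N ~ Poisson(lam); terms are evaluated in log-space."""
    if m <= 0:
        return 1.0
    below = sum(
        math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1)) for j in range(m)
    )
    return max(0.0, 1.0 - below)


def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tilted_convolution_at_t(nu, t, x):
    """Gbar^{*k_t(x)}(t) when Gbar is Exp(nu), i.e. an Erlang(k_t(x), nu) c.d.f. at t.

    Uses k_t(x) = t/nubar + x*sqrt(t*sigmabar^2/nubar^3) with nubar = 1/nu and
    sigmabar^2 = 1/nu^2, rounded down to an integer number of convolutions.
    """
    lam = nu * t
    m = math.floor(lam + x * math.sqrt(lam))
    return poisson_tail(lam, m)


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        approx = tilted_convolution_at_t(nu=2.0, t=2000.0, x=x)
        print(f"x={x:+.1f}  convolution={approx:.4f}  Phi(-x)={std_normal_cdf(-x):.4f}")
```

For large $t$ the two columns agree to within the usual $O(t^{-1/2})$ normal-approximation error, in line with the limit $\bar G^{*k_t(x)}(t) \to \Phi(-x)$.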

We continue with the proof of Proposition 3.1(a) for the $n$-dependent CTBP. We denote the
number of alive individuals at time $t$ in the $n$-dependent CTBP by $|\mathrm{BP}_{(n)}(t)|$. We then have to
show that uniformly in $n$,

$$ e^{-\alpha_n t_n}\,\mathbb{E}[|\mathrm{BP}_{(n)}(t_n)|] \to A, \tag{3.41} $$

where $A$ is given in (3.34). Denote by $\varphi(s) = \int_0^{\infty} e^{-sy}g(y)\,dy$ the Laplace transform of the
lifetime distribution ($g$ is the density of $G$). Then

$$ \int_0^{\infty} e^{-st}\,\mathbb{E}[|\mathrm{BP}_{(n)}(t)|]\,dt = \frac{1-\varphi(s)}{s(1-\nu_n\varphi(s))}. \tag{3.42} $$

This equation follows directly from [29, Equation 16.1], with $m$ replaced by $\nu_n$, and is valid
when the real part of $s$ satisfies $\mathrm{Re}(s) > \alpha_n$, where $\alpha_n > 0$ is defined as the unique value with
$\nu_n\varphi(\alpha_n) = 1$. From the inversion formula for Laplace transforms, we obtain:

$$ \mathbb{E}[|\mathrm{BP}_{(n)}(t)|] = \frac{1}{2\pi i}\int_{\Gamma} e^{st}\,\frac{1-\varphi(s)}{s(1-\nu_n\varphi(s))}\,ds, \tag{3.43} $$

where $\Gamma$ is the path $(c - i\infty, c + i\infty)$, with $c > \alpha_n$. Since $\alpha_n \to \alpha$ and $\nu_n \to \nu > 1$ and $\varphi(s)$
is the Laplace transform of a probability density, the function $s(1-\nu_n\varphi(s))$ has a simple zero
$s = \alpha_n$, but no other zeros in a small strip $|s - \alpha_n| < \delta$. It is now easy to conclude from Cauchy's
theorem, calculating the residue at $s = \alpha_n$, that



E[|BP(„)(t.)i] = e""^" i + o(e--^": 

= A„e""*"(l + 0(n-^/(2a„))^^ (3 44) 

where 
Since $A_n \to A$, by Condition 1.1(b), the claim (3.41) follows.
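For concrete lifetime laws, the Malthusian rate defined by $\nu\varphi(\alpha) = 1$ and the limiting constant can be computed directly. The following sketch is an illustration under stated assumptions: it uses the classical Bellman-Harris expression $A = (\nu-1)/(\alpha\nu^2\int_0^\infty y e^{-\alpha y}\,dG(y))$, takes the Laplace transform and the weighted mean as user-supplied callables, and the function names are hypothetical rather than taken from the paper.

```python
def malthusian_rate(nu, laplace, lo=0.0, hi=50.0, tol=1e-12):
    """Solve nu * laplace(alpha) = 1 for alpha > 0 by bisection.

    laplace(s) must be the Laplace transform of the lifetime density, which is
    decreasing in s; nu > 1 guarantees a sign change on (0, infinity).
    """
    def f(a):
        return nu * laplace(a) - 1.0

    while f(hi) > 0:  # enlarge the bracket until the root is enclosed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limit_constant(nu, alpha, weighted_mean):
    """A = (nu - 1) / (alpha * nu^2 * int_0^inf y e^{-alpha y} dG(y)).

    weighted_mean(alpha) must return the integral int_0^inf y e^{-alpha y} dG(y).
    """
    return (nu - 1.0) / (alpha * nu ** 2 * weighted_mean(alpha))


if __name__ == "__main__":
    # Exp(1) lifetimes: phi(s) = 1/(1+s), int y e^{-a y} e^{-y} dy = 1/(1+a)^2.
    alpha = malthusian_rate(3.0, lambda s: 1.0 / (1.0 + s))
    print(alpha, limit_constant(3.0, alpha, lambda a: 1.0 / (1.0 + a) ** 2))
    # Gamma(2,1) lifetimes: phi(s) = 1/(1+s)^2, int y e^{-a y} y e^{-y} dy = 2/(1+a)^3.
    alpha = malthusian_rate(4.0, lambda s: 1.0 / (1.0 + s) ** 2)
    print(alpha, limit_constant(4.0, alpha, lambda a: 2.0 / (1.0 + a) ** 3))
```

For exponential lifetimes the output is $\alpha = \nu - 1$ and $A = 1$, matching the exact mean $e^{(\nu-1)t}$ of the corresponding Markov branching process.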

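For the second statement in Proposition 3.1(a) for the $n$-dependent CTBP, we replace (3.10) by
the equivalent $n$-dependent statement:

$$ 1 - \prod_{i=1}^{\infty}\frac{\nu_n^{(i)}}{\nu_n} \le (\log(1/\eta))^{-1}\,\mathbb{E}[X_n\log(X_n/K_n)_+]. \tag{3.46} $$

Since $X_n \xrightarrow{d} X$, $\mathbb{E}[X_n] \to \mathbb{E}[X]$ and $\limsup_{n\to\infty}\mathbb{E}[X_n\log(X_n/K_n)_+] = 0$, the statement follows.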

For the $n$-dependent case of Proposition 3.1(b), we need to show that uniformly in $n$,

$$ e^{-2\alpha_n t_n}\,\mathbb{E}[|\mathrm{BP}_{(n)}^{(\vec m)}(t_n)|^2] \le C K_n, \tag{3.47} $$

for some constant $C$ and where $K_n$ is defined through $m_j = K_n\eta_n^{-j}$, where $\eta_n = \nu_n\int_0^{\infty} e^{-2\alpha_n y}\,dG(y)$
and $\nu_n = \mathbb{E}[X_n]$. Copying the derivation which leads to (3.17), we obtain:

oo 

E[|BP;j|2] = ^ 6^"' . . . bf\af %\BP[':f\f * Gf\ * Q^"\ (3.48) 
i=i 

where 

(„) _ 7?„E[(X„ A mjf] („) _ r,nE[{Xn/\mj)] 

a J , bj , (3.49) 

t ft 

1,, / „-2any , 



$$ \bar G_n(t) = \nu_n\int_0^t e^{-\alpha_n y}\,dG(y), \qquad \bar Q_n(t) = \eta_n^{-1}\nu_n\int_0^t e^{-2\alpha_n y}\,dG(y). \tag{3.50} $$
From the proof of Lemma 3.3, we readily obtain that: 

oo oo oo jy. 

^ b[-^ . . . 6j"_^iaj"' < m]F{Xn > mjWn + E mllix.<n.M < ^rr^. (3.51) 

j=l j=l j=l 

Since $\nu_n \to \nu$ and $\eta_n \to \eta$ as $n \to \infty$, we find, by combining (3.48) with (3.51), that given $\varepsilon > 0$,
there is an $n_0$ so that for $n > n_0$,

e-2-*"E[|BP[r>(t„)P] < ^ < CKn. (3.52) 

[l - r] - e){v - e) 

By (if necessary) enlarging the constant C we see that (3.47) holds for all n. This proves Propo- 
sition 3.1(b) for the n-dependent CTBP. 

Finally, we consider Proposition 3.1(c) for the $n$-dependent CTBP. We denote by $|\mathrm{BP}_{(n),j}[t,t+s)|$
the number of individuals in generation $j$ having residual lifetime at most $s$ at time $t$ of the CTBP
with offspring given by $X_n$. Then, we obtain, compare (3.32),

E[|BP(„).>,[t,i + s)|] = / E[|BP(„,[t-y,t + s-y)|]dG;'=(y). (3.53) 

Jo 

As in the proof of Proposition 3.1(c), 

limE[|BP(„)(t)|] =^„F«(s), 

where $A_n$ was defined in (3.45). A small extension of the CLT yields that

Since ^ ^, o^n ^ ot^ An — ^, as 72 — oo, it follows that 

E[|BP,„),>,,^(,)[t,t + s)|] = / nWUtn-y,tn-y + s)\]dG*>^''\y)^ AF^{s)^{x). (3.54) 

Jo 

This completes the proof of Proposition 3.1(c) for n-dependent CTBPs. ■ 
4 Coupling to CTBP: Proof of Proposition 2.4



4.1 The coupling 

The exploration of the total progeny of a branching process satisfies the same recurrence relation
as in (2.2), apart from the fact that the random variables $(X_k)_{k\ge 1}$ are i.i.d. for a CTBP. For
$\mathrm{CM}_n(\boldsymbol d)$, clearly, $(X_k)_{k\ge 1}$ are not i.i.d. We now describe stochastic relations between $(X_k)_{k\ge 1}$
given in the CM and an i.i.d. sequence $(Y_k)_{k\ge 1}$ with distribution equal to that of $D_n^* - 1$ for all
$k \ge 1$, where the distribution of $D_n^* - 1$ has probability mass function $(g_k^{(n)})_{k\ge 0}$ defined by

$$ \mathbb{P}(D_n^* - 1 = k) = g_k^{(n)} = \frac{k+1}{\ell_n}\sum_{i=1}^{n}\mathbf{1}_{\{d_i = k+1\}}, \qquad k \ge 0. \tag{4.1} $$
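The size-biased law in (4.1) is easy to tabulate from a degree sequence. The snippet below is a small illustrative sketch (the function names are hypothetical); it also computes the mean of the law, which equals $\sum_i d_i(d_i-1)/\ell_n$.

```python
from collections import Counter


def size_biased_pmf(degrees):
    """Return {k: g_k} with g_k = (k + 1) / l_n * #{i : d_i = k + 1}.

    l_n is the total degree sum(degrees); vertices of degree 0 carry no
    half-edges and therefore do not contribute.
    """
    total = sum(degrees)
    counts = Counter(degrees)
    return {d - 1: d * c / total for d, c in sorted(counts.items()) if d > 0}


def size_biased_mean(degrees):
    """Mean of the law above, i.e. sum_i d_i (d_i - 1) / l_n."""
    return sum(k * g for k, g in size_biased_pmf(degrees).items())


if __name__ == "__main__":
    degrees = [1, 2, 2, 3]
    print(size_biased_pmf(degrees))   # {0: 0.125, 1: 0.5, 2: 0.375}
    print(size_biased_mean(degrees))  # 1.25
```

Drawing a uniform half-edge and reporting the degree of its vertex minus one samples exactly from this law, which is how the i.i.d. sequence $(Y_k)_{k\ge1}$ is generated in the coupling.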

Recall that $U_1, U_2$ are chosen uniformly at random from $[n]$. We continue with the definition of
the size-biased reordering of $[n]\setminus\{U_1,U_2\}$.

Definition 4.1 (Size-biased reordering) Given the set $[n] = \{1,2,\ldots,n\}$, vertices $U_1, U_2$,
and degree sequence $\boldsymbol d$, so that element $i \in [n]$ has degree $d_i$, a size-biased reordering of $[n]\setminus\{U_1,U_2\}$ of size $m_n$ is a random sequence of (different) elements $V_1,\ldots,V_{m_n}$, where we select
$V_i$, $1 \le i \le m_n$, at random from the set $[n]\setminus\{U_1,U_2,V_1,\ldots,V_{i-1}\}$ with probability proportional to
the remaining degrees:

$$ \{d_1,\ldots,d_n\}\setminus\{d_{U_1},d_{U_2},d_{V_1},\ldots,d_{V_{i-1}}\}. $$

Let $B_i + 1 = d_{V_i}$. We let $X_i$ be the number of sibling half-edges of $V_i$ that do not create cycles,
i.e., are connected to vertices unequal to $\{U_1,U_2,V_1,\ldots,V_{i-1}\}$. Thus, clearly, $X_i \le B_i$. The
above set-up allows us to define the coupling between $(B_i)_{i\ge 1}$ and $(Y_i)_{i\ge 1}$, where $(Y_i)_{i\ge 1}$ is an
i.i.d. sequence with distribution (4.1).

Construction 4.2 (Coupling of size-biased reordering) We couple $(B_i)_{i\ge 1}$ and $(Y_i)_{i\ge 1}$ in
the following way:

(a) Draw $Y_i$ as an independent copy of the distribution in (4.1). This can be achieved by drawing
a uniform half-edge $y$ from the total of $\ell_n$ half-edges. Let $V_i' = V_y$ denote the vertex to which the
chosen half-edge is incident, and let $Y_i = d_{V_i'} - 1$.

(b) If $V_i' \notin \{U_1,U_2,V_1,\ldots,V_{i-1}\}$, then $B_i = Y_i$ and $V_i = V_i'$, and we say that $V_i$ is successfully
coupled with $V_i'$.

(c) If $V_i' \in \{U_1,U_2,V_1,\ldots,V_{i-1}\}$, so that we draw a half-edge incident to the set $\{U_1,U_2,V_1,\ldots,V_{i-1}\}$,
then we redraw a half-edge from the set of half-edges incident to $[n]\setminus\{U_1,U_2,V_1,\ldots,V_{i-1}\}$ with
probability proportional to their degree, we let $V_i$ denote the vertex incident to the half-edge drawn,
$B_i = d_{V_i} - 1$, and we say that both $V_i$ and $V_i'$ are miscoupled.

(d) We define $X_i$ as the number of the $d_{V_i} - 1$ half-edges incident to vertex $V_i$ that are not paired
to a half-edge incident to $\{U_1,U_2,V_1,\ldots,V_{i-1}\}$.
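Steps (a) to (c) of Construction 4.2 can be sketched as a short simulation. This is a minimal illustration and not the paper's implementation: the function name `couple_size_biased`, the 0-based vertex labels and the example degree sequence are assumptions, the redraw in step (c) is taken to be uniform over the half-edges of unexplored vertices, and step (d), which requires the actual pairing of half-edges, is left out.

```python
import random


def couple_size_biased(degrees, u1, u2, m, rng):
    """Couple a size-biased reordering (V_i, B_i) with i.i.d. draws Y_i.

    degrees : list of vertex degrees, vertices labelled 0..n-1 (an assumption)
    u1, u2  : the two source vertices, excluded from the reordering
    m       : number of vertices to explore
    rng     : a random.Random instance
    Returns a list of tuples (V_i, B_i, Y_i, miscoupled).
    """
    # One list entry per half-edge, labelled by its vertex, so that a uniform
    # draw selects a vertex with probability proportional to its degree.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    explored = {u1, u2}
    out = []
    for _ in range(m):
        v_prime = rng.choice(half_edges)        # step (a): uniform half-edge
        y = degrees[v_prime] - 1
        if v_prime not in explored:             # step (b): successful coupling
            v, miscoupled = v_prime, False
        else:                                   # step (c): redraw among unexplored
            remaining = [w for w in half_edges if w not in explored]
            if not remaining:
                break
            v, miscoupled = rng.choice(remaining), True
        explored.add(v)
        out.append((v, degrees[v] - 1, y, miscoupled))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    degrees = [3] * 50 + [10] * 5
    steps = couple_size_biased(degrees, 0, 1, 20, rng)
    print(sum(1 for s in steps if s[3]), "miscouplings in", len(steps), "steps")
```

Whenever a step is not flagged as miscoupled, the returned $B_i$ and $Y_i$ coincide; the probability of a miscoupling at step $i$ is the fraction of half-edges attached to already explored vertices, as quantified in Lemma 4.3.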

We next investigate the above coupling. For this, it will be useful to note that when $D_n$, with
distribution function $F_n$ given in (1.2), satisfies Condition 1.1(c), then the maximal degree $\Delta_n = \max_{i\in[n]} d_i$ satisfies

$$ \Delta_n = o\big(\sqrt{n/\log n}\big). \tag{4.2} $$

Indeed, suppose that $\Delta_n \ge \varepsilon\sqrt{n/\log n}$. Then, pick $K_n = n^{1/4}$ to obtain that

$$ \mathbb{E}[D_n^2\log(D_n/K_n)_+] = \frac{1}{n}\sum_{k=1}^{n} d_k^2\log(d_k/n^{1/4})_+ \ge \frac{\Delta_n^2}{n}\log(\Delta_n/n^{1/4}) \ge n^{-1}\big(\varepsilon\sqrt{n/\log n}\big)^2\log\big(n^{1/4}/(\log n)^{1/2}\big) \ge \frac{\varepsilon^2}{8} \tag{4.3} $$

for $n$ sufficiently large. This is in contradiction to Condition 1.1(c), so we conclude that (4.2) holds.

We define the sigma-algebra $\mathcal G_i$ by $\mathcal G_i = \sigma(d_{U_1}, d_{U_2}, X_0^{(1)}, X_0^{(2)}, X_j, B_j)_{j\in[i]}$, see Section 2 for the
definition of $X_0^{(i)}$, $i \in \{1,2\}$.

Lemma 4.3 (Coupling to an i.i.d. sequence) Assume that Condition 1.1(c) holds. For all
$i \le m_n$, and assuming that $m_n \le \sqrt{n\log n}$,




where, as before, $S_0 = X_0^{(1)} + X_0^{(2)}$.

Proof. In Construction 4.2, $B_i \ne Y_i$ precisely when $V_i' \in \{U_1,U_2,V_1,\ldots,V_{i-1}\}$, which, given
$\mathcal G_{i-1}$, has probability at most




since $Y_i$ draws uniformly from a total of $\ell_n$ half-edges, whereas in the previous draws at most
$X_0^{(1)} + X_0^{(2)} + \sum_{s=1}^{i-1}(B_s + 1)$ half-edges are incident to the vertices $\{U_1,U_2,V_1,\ldots,V_{i-1}\}$.

By (4.2), $\Delta_n = o(\sqrt{n/\log n})$ so that

$$ X_0^{(1)} + X_0^{(2)} + \sum_{s=1}^{i-1}(B_s + 1) \le m_n\Delta_n = o(n). $$

The final statement in (4.4) follows from $\ell_n = n\mu(1+o(1))$. ■



Lemma 4.4 (Probability of drawing a half-edge incident to a previously found vertex)
Assume that Condition 1.1(c) holds. For all $i \le m_n$, and assuming that $m_n \le \sqrt{n\log n}$,



Proof. Recall the definition of $S_i$ in (2.2). We have $X_i < B_i$ precisely when we pair at least one
of the $B_i$ half-edges to a half-edge incident to $\{U_1,U_2,V_1,\ldots,V_{i-1}\}$. Since there are precisely $B_i$
half-edges that need to be paired, and the number of half-edges incident to $\{U_1,U_2,V_1,\ldots,V_{i-1}\}$,
given $\mathcal G_{i-1}$, equals $S_{i-1}$, we find

Clearly, $S_{i-1} \le S_0 + \sum_{s=1}^{i-1} B_s$, which completes the proof. As before, $\sum_{s=1}^{i-1}(B_s + 1) \le m_n\Delta_n = o(n)$ a.s. when $m_n \le \sqrt{n\log n}$, which explains the $\ell_n(1 - o(1))$ in the denominator of (4.6). ■
Coupling the flows. In the above, we have described the coupling of our size-biased reordering.
We now extend this to a coupling between the SWGs and CTBPs. We will couple in such a way
that vertices that are successfully coupled (and thus have the same number of offspring half-edges,
respectively, individuals), also receive the same weight along these half-edges. Therefore, the
subtrees of successfully coupled half-edges in the SWG and individuals in the CTBP are found
at precisely the same times. We will consistently refer to alive objects in the SWG as alive
half-edges, and as alive individuals in the CTBP.

More specifically, offspring half-edges or individuals of miscoupled vertices are by definition miscoupled. The weights assigned to miscoupled half-edges in the SWG and individuals in the CTBP
will be independent. Recall that $T_k$ denotes the time at which the $k$th half-edge is found by the
flow in the SWG. When the half-edge found is incident to a successfully coupled vertex, then we
use Construction 4.2 to couple the number of offspring half-edges in the SWG to the offspring
individuals in the CTBP. When the half-edge is incident to a miscoupled vertex, it is only present
in the SWG, and we draw a half-edge from the set of available half-edges as in Construction
4.2(c), ignoring $Y_k$ in Construction 4.2. We define $(T_k')_{k\ge 0}$ as the times where an individual dies
in the CTBP, but no half-edge is found by the flow in the SWG. These events result from one
or more miscouplings between the CTBP and the SWG. At such times we draw a half-edge $y$
uniformly at random from the total number of half-edges, and let $Y_i = d_{V_y} - 1$ denote the number
of sibling half-edges. Note that in this case, we do not rely on Construction 4.2, and thus $V_y$ is
not part of the size-biased reordering.

Because of the above construction, differences arising in the coupling are due to two effects:

(1) a miscoupling occurs: miscoupling between the size-biased reordering $B_i$ and the i.i.d. draw
from the degree distribution $Y_i$, and

(2) a cycle-creating event occurs: Here we refer to the occurrence of cycles, which makes $X_i < B_i$,
and, by our construction of the collision edges, removes the $B_i - X_i$ half-edges incident to vertex
$V_i$, as well as the $B_i - X_i$ half-edges to which they are paired from the SWG.

Recall that offspring of miscoupled vertices are also miscoupled, so any miscoupling gives rise to
a tree of miscoupled children half-edges in the SWG, respectively, individuals in the CTBP.

In order to be able to show convergence in probability to a random variable, we assume that all 
couplings are defined on one and the same probability space. 

4.2 Coupling the SWG to a CTBP: proof of Proposition 2.4(a) 

Consider an age-dependent branching process with lifetimes having a distribution admitting density $g$, and offspring distribution given by $(f_k)_{k\ge 1}$ in the first generation and offspring distribution
$(g_k)_{k\ge 0}$ in the second and all further generations, and let $D$ have probability mass function (p.m.f.)
$(f_k)_{k\ge 1}$.

Let $D^*$ ($D_n^*$) be a random variable such that $D^* - 1$ ($D_n^* - 1$) has p.m.f. $(g_k)_{k\ge 0}$ ($(g_k^{(n)})_{k\ge 0}$), i.e.,



k + 1 



n 



{k + 1)F{D -l = k) 





(4.8) 



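By Condition 1.1, $D_n^* \xrightarrow{d} D^*$, so that, since $D_n^*$ and $D^*$ are discrete distributions,

$$ d_{\mathrm{TV}}(D_n^*, D^*) \to 0, \tag{4.9} $$

where $d_{\mathrm{TV}}$ denotes the total variation distance, see for instance [55, Theorem 6.1].
Proof of Proposition 2.4(a). Take $s_n$ maximal such that

$$ e^{2\alpha s_n}\,d_{\mathrm{TV}}(D_n^*, D^*) \to 0. \tag{4.10} $$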
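According to (1.17), and with $i \in \{1,2\}$,

$$ e^{-\alpha s_n}|\mathrm{BP}^{(i)}(s_n)| \xrightarrow{a.s.} \mathcal W^{(i)}, \tag{4.11} $$

where $\mathcal W^{(i)}$ are two independent copies of $\mathcal W$. Since $\mathbb{P}(\mathcal W^{(i)} < \infty) = 1$ and $e^{\alpha s_n} \to \infty$, we conclude
that $|\mathrm{BP}(s_n)| \le k_n$ whp if we take $k_n = \lfloor e^{2\alpha s_n}\rfloor$. If this $k_n$ does not satisfy $k_n = o(\sqrt n)$ then
lower $s_n$ so that the corresponding value of $k_n = \lfloor e^{2\alpha s_n}\rfloor$ does satisfy $k_n = o(\sqrt n)$.

Recall that $\mathcal G_i = \sigma(d_{U_1}, d_{U_2}, X_0^{(1)}, X_0^{(2)}, X_j, B_j)_{j\in[i]}$. By Boole's inequality,

$$ \mathbb{P}(X_i \ne Y_i \mid \mathcal G_{i-1}) \le \mathbb{P}(X_i < B_i \mid \mathcal G_{i-1}) + \mathbb{P}(B_i \ne Y_i \mid \mathcal G_{i-1}). \tag{4.12} $$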

Consequently, by Lemmas 4.3-4.4, a lower bound for the probability of coupling successfully
during the first $k_n = o(\sqrt n)$ pairings is

kfi kn 

p( f|{x, = yj) = 1 - p( |J{x, / y,}) 

1=1 i=l 

-* kfi i 1 

- ^ " (1- m\ E (e[S*5o] + 1 + Y^inBiBs] + 1)) > 1 - ckl/n ^ 1. (4.13) 
Here we rely on the fact that 

I < Y: T^^^ = -nil + o(l)), (4.14) 

whenever $k_n\Delta_n = o(n)$, which follows from (4.2).

The lower bound (4.13) implies that, whp, the shortest weight graph $(\mathrm{SWG}(s))_{s\le s_n}$ is perfectly
coupled to the CTBP $(\mathrm{BP}(s))_{s\le s_n}$. This proves Proposition 2.4(a). ■

We close this section by investigating moments of the size-biased variables $(B_i)_{i\ge 1}$ arising in the
size-biased reordering. These moments play a crucial role throughout the remainder of this paper,
and allow us to compare $(B_i)_{i\ge 1}$ to an i.i.d. sample of random variables having the size-biased
random distribution.

Lemma 4.5 (Moments of the size-biased reordering) Assume that Condition 1.1(a-c) holds.
For all $i \le m_n$, and assuming that $m_n \le \sqrt{n\log n}$, and for any $K_n \to \infty$ such that $K_n = o(n/m_n)$,

$$ \mathbb{E}[B_i\mathbf{1}_{\{B_i \le K_n\}} \mid \mathcal G_{i-1}] = (1 + o_{\mathbb P}(1))\,\nu_n, \tag{4.15} $$
$$ \mathbb{E}[B_i\mathbf{1}_{\{B_i > K_n\}} \mid \mathcal G_{i-1}] = o_{\mathbb P}(1). \tag{4.16} $$
Proof. We use the upper bound 

(4.17) 

^"^^ ""^^'^ ie[n] 
where we again use that, since $m_n \le \sqrt{n\log n}$,

$$ \ell_n - S_0 - \sum_{j=1}^{i-1} B_j \ge \ell_n - m_n\Delta_n = \ell_n - o(n). \tag{4.18} $$

This provides the necessary upper bound in (4.15) by taking $a = 1$ and from the identity $\nu_n = \sum_{i\in[n]} d_i(d_i - 1)/\ell_n$. For (4.16), this also proves the necessary bound, since

$$ \frac{1}{\ell_n}\sum_{i\in[n]} d_i(d_i-1)\mathbf{1}_{\{d_i \ge K_n\}} = o(1). \tag{4.19} $$

For the lower bound in (4.15), we bound, instead,

E[Bil{Bi<K„} I Gi-l] ^ X] ~ ^)^{di<K„+l}'^{l not chosen yet}; (4.20) 

" l£[n] 

where the event "$l$ is not chosen yet" means that the vertex $l$ has not been chosen in the size-biased
reordering until time $i - 1$. We now bound

E[-Bil{B,<i4'„} I Gi-l] > XI ~ l)lM<i^„+l} ~ di{di - l)l{d,<i^„+l}l{« is chosen}- 

tG|n] l€[n\ 

(4.21) 

The first term equals $\nu_n(1 + o(1))$. The second term is a.s. bounded by $m_n K_n^2/\ell_n = o(1)$, since
$K_n^2 = o(n/m_n)$. ■



4.3 Completing the coupling: Proof of Proposition 2.4(b) 

In this section, we use Proposition 3.1 to prove Proposition 2.4(b). In order to bound the
difference between $\mathrm{BP}(t)$ and $\mathrm{SWG}(t)$, we will introduce several events. Let $B_n, C_n, \varepsilon_n, \underline m_n, \overline m_n$
denote sequences of constants for which $C_n \to \infty$ and $\varepsilon_n \to 0$ arbitrarily slowly, and $\overline m_n \gg \sqrt n$, $\underline m_n \ll \sqrt n$. Later in this proof, we will formulate restrictions on these sequences.

Define the event $\mathcal A_n$ as follows:

$$ \mathcal A_n = \{|\mathrm{SWG}(t_n + B_n)\,\triangle\,\mathrm{BP}_{(n)}(t_n + B_n)| < \varepsilon_n\sqrt n\}, \tag{4.22} $$

where $|\mathrm{SWG}(t_n + B_n)\,\triangle\,\mathrm{BP}_{(n)}(t_n + B_n)|$ is the number of miscoupled half-edges plus the number
of miscoupled individuals. Then Proposition 2.4(b) can be reformulated as

$$ \mathbb{P}(\mathcal A_n^c \mid \mathcal F_{s_n}) = o_{\mathbb P}(1). \tag{4.23} $$


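In order to prove (4.23), we introduce the following events:

$$ \mathcal B_n = \{Y^{(\mathrm{SWG})}(t_n + B_n) \le \overline m_n\} \cap \{Y^{(\mathrm{BP})}(t_n + B_n) \le \overline m_n\} \cap \{Y^{(\mathrm{BP})}(t_n - B_n) \le \underline m_n\} \cap \{Y^{(\mathrm{SWG})}(t_n - B_n) \le \underline m_n\}, \tag{4.24} $$

$$ \mathcal C_n = \{\mathrm{SWG}(t) = \mathrm{BP}(t),\ \forall t \le t_n - B_n\}, \tag{4.25} $$

$$ \mathcal D_n = \{\nexists\, i \text{ such that } T_i \le t_n + B_n,\ V_i \text{ miscoupled},\ d_{V_i} \ge C_n\}, \tag{4.26} $$

where

$$ Y^{(\mathrm{BP})}(t) = |\{v : v \in \mathrm{BP}_{(n)}(s) \text{ for some } s \le t\}| \tag{4.27} $$

denotes the total number of individuals ever born into the BP before time $t$ and

$$ Y^{(\mathrm{SWG})}(t) = |\{v : v \in \mathrm{SWG}(s) \text{ for some } s \le t\}| \tag{4.28} $$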

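denotes the number of half-edges in the SWG that have ever been alive before time $t$. Informally,
on $\mathcal B_n$, the total number of half-edges in SWG and individuals in the CTBP are not too large. On
$\mathcal C_n$ there is no early miscoupling, while on $\mathcal D_n$ there is no miscoupled vertex having high degree
until a late stage.

Obviously

$$ \mathbb{P}(\mathcal A_n^c \mid \mathcal F_{s_n}) \le \mathbb{P}(\mathcal B_n^c \mid \mathcal F_{s_n}) + \mathbb{P}(\mathcal C_n^c \cap \mathcal B_n \mid \mathcal F_{s_n}) + \mathbb{P}(\mathcal D_n^c \cap \mathcal B_n \cap \mathcal C_n \mid \mathcal F_{s_n}) + \mathbb{P}(\mathcal A_n^c \cap \mathcal B_n \cap \mathcal C_n \cap \mathcal D_n \mid \mathcal F_{s_n}). \tag{4.29} $$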

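To bound conditional probabilities of the form $\mathbb{P}(\mathcal E^c \mid \mathcal F_{s_n})$, we start by noting that it suffices
to prove that $\mathbb{P}(\mathcal E^c) = o(1)$, since then, by the Markov inequality and for every $\varepsilon > 0$,

$$ \mathbb{P}\big(\mathbb{P}(\mathcal E^c \mid \mathcal F_{s_n}) \ge \varepsilon\big) \le \mathbb{E}[\mathbb{P}(\mathcal E^c \mid \mathcal F_{s_n})]/\varepsilon = \mathbb{P}(\mathcal E^c)/\varepsilon = o(1). \tag{4.30} $$

Thus, we are left to prove that

$$ \mathbb{P}(\mathcal B_n^c) = o(1), \quad \mathbb{P}(\mathcal C_n^c \cap \mathcal B_n) = o(1), \quad \mathbb{P}(\mathcal D_n^c \cap \mathcal B_n \cap \mathcal C_n) = o(1), \quad \mathbb{P}(\mathcal A_n^c \cap \mathcal B_n \cap \mathcal C_n \cap \mathcal D_n) = o(1). \tag{4.31} $$

We will do so in the above order.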

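Lemma 4.6 (Expected number of particles born) For all $t \ge 0$,

$$ \mathbb{E}[Y^{(\mathrm{BP})}(t)] = 2\Big(1 - \frac{\mu_n G(t)}{\nu_n - 1}\Big) + \frac{\nu_n}{\nu_n - 1}\,\mathbb{E}[|\mathrm{BP}_{(n)}(t)|]. \tag{4.32} $$

Consequently, when $e^{\alpha_n(t_n + B_n)} = o(\overline m_n)$, $e^{\alpha_n(t_n - B_n)} = o(\underline m_n)$,

$$ \mathbb{P}(\mathcal B_n^c) = o(1). \tag{4.33} $$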

Proof. Note that we grow two SWGs and two BPs, which explains the factor 2 in (4.32). As
is well known the expected number of descendants in generation $k$ of a BP equals $\nu_n^k$, where
$\nu_n$ denotes the mean offspring. Here, we deal with a delayed BP where in the first generation
the mean number of offspring equals $\mu_n = \mathbb{E}[D_n]$; the factor $G^{*k}(t) - G^{*(k+1)}(t)$ represents the
probability that an individual of generation $k$ is alive at time $t$; together this yields:

$$ \mathbb{E}[|\mathrm{BP}_{(n)}(t)|] = \sum_{k=1}^{\infty} 2\mu_n\nu_n^{k-1}\big[G^{*k}(t) - G^{*(k+1)}(t)\big], \qquad \mathbb{E}[Y^{(\mathrm{BP})}(t)] = 2 + \sum_{k=1}^{\infty} 2\mu_n\nu_n^{k-1}G^{*k}(t). \tag{4.34} $$

We can rewrite the equality for $\mathbb{E}[|\mathrm{BP}_{(n)}(t)|]$ to obtain

$$ \mathbb{E}[|\mathrm{BP}_{(n)}(t)|] = 2\mu_n G(t) + \sum_{k=1}^{\infty} 2\mu_n\big[\nu_n^{k} - \nu_n^{k-1}\big]G^{*(k+1)}(t) = 2\mu_n G(t)/\nu_n - 2(1 - \nu_n^{-1}) + (1 - \nu_n^{-1})\,\mathbb{E}[Y^{(\mathrm{BP})}(t)]. \tag{4.35} $$

Solving for $\mathbb{E}[Y^{(\mathrm{BP})}(t)]$ yields the proof of (4.32).
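The identity (4.32) can be verified numerically in a special case. The sketch below assumes exponential edge weights with rate 1, so that $G^{*k}$ is an Erlang$(k,1)$ distribution function, and evaluates the two series $\sum_{k\ge1} 2\mu\nu^{k-1}[G^{*k}(t) - G^{*(k+1)}(t)]$ and $2 + \sum_{k\ge1} 2\mu\nu^{k-1}G^{*k}(t)$ directly. The function names, the truncation level `k_max` and the parameter values are illustrative assumptions.

```python
import math


def erlang_cdf(k, t, extra=400):
    """G^{*k}(t) for G = Exp(1), computed as the Poisson(t) tail P(N >= k).

    The tail is summed directly (rather than as 1 minus the head) so that very
    small values keep their relative accuracy.
    """
    if k == 0:
        return 1.0
    return sum(
        math.exp(-t + j * math.log(t) - math.lgamma(j + 1))
        for j in range(k, k + extra)
    )


def expected_counts(mu, nu, t, k_max=120):
    """Return (E[alive at time t], E[ever born by time t]) for two delayed BPs.

    mu is the mean first-generation offspring and nu the mean offspring in all
    later generations; both series are truncated at generation k_max.
    """
    alive = sum(
        2 * mu * nu ** (k - 1) * (erlang_cdf(k, t) - erlang_cdf(k + 1, t))
        for k in range(1, k_max + 1)
    )
    ever = 2 + sum(
        2 * mu * nu ** (k - 1) * erlang_cdf(k, t) for k in range(1, k_max + 1)
    )
    return alive, ever


if __name__ == "__main__":
    mu, nu, t = 3.0, 2.0, 3.0
    alive, ever = expected_counts(mu, nu, t)
    rhs = 2 * (1 - mu * erlang_cdf(1, t) / (nu - 1)) + nu / (nu - 1) * alive
    print(ever, rhs)
```

The two printed numbers coincide up to rounding, which is the statement that the total number ever born is a fixed affine function of the number currently alive.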

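To bound $\mathbb{P}(\mathcal B_n^c)$, we note that we have to bound events of the form $\mathbb{P}(Y^{(\mathrm{BP})}(t) > m)$ and
$\mathbb{P}(Y^{(\mathrm{SWG})}(t) > m)$ for various choices of $m$ and $t$. We use the Markov inequality and (4.32)
to bound

$$ \mathbb{P}(Y^{(\mathrm{BP})}(t) > m) \le \mathbb{E}[Y^{(\mathrm{BP})}(t)]/m \le \frac{\nu_n}{(\nu_n - 1)m}\,\mathbb{E}[|\mathrm{BP}_{(n)}(t)|] + \frac{2}{m}. \tag{4.36} $$

By Proposition 3.1(d), $\mathbb{E}[|\mathrm{BP}_{(n)}(t_n)|] = A_n e^{\alpha_n t_n}(1 + o(1))$, so that

$$ \mathbb{P}(Y^{(\mathrm{BP})}(t_n) > m) = \Theta(e^{\alpha_n t_n}/m). \tag{4.37} $$

The conditions on $t$ and $m$ in Lemma 4.6 have been chosen precisely so that $e^{\alpha_n(t_n - B_n)}/\underline m_n \to 0$,
and $e^{\alpha_n(t_n + B_n)}/\overline m_n \to 0$.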

We continue with P(y*^'^'^'(t) > m). We use the same steps as above, and start by computing 

oo 

E[y{swG)^^)] =2 + 2^G*'=(t)E[Pfc*], (4.38) 

fc=0 

where Pq = In/n and 

|7r|=fc,7rCCM„{d) 

is the sum of the number of half-edges at the ends of paths of lengths $k$ in $\mathrm{CM}_n(\boldsymbol d)$, from a
uniformly selected starting point. See [39, Section 5] for more details on paths in $\mathrm{CM}_n(\boldsymbol d)$. We
compute that

npn = - E <ri^%^' (4-39) 

where the sum is over distinct vertices in [n]. By [39, Proof of Lemma 5.1], the latter sum is 
bounded by 

nPi\ < nDn]iy^/n. (4.40) 
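As a result, we have that $\mathbb{E}[Y^{(\mathrm{SWG})}(t)] \le \mathbb{E}[Y^{(\mathrm{BP})}(t)]$, and we can repeat our arguments for
$\mathbb{E}[Y^{(\mathrm{BP})}(t)]$. ■

Lemma 4.7 (No early miscoupling) When $e^{\alpha_n(t_n - B_n)} = o(\underline m_n)$ and $\underline m_n = o(\sqrt n)$, then:

$$ \mathbb{P}(\mathcal C_n^c \cap \mathcal B_n) = o(1). \tag{4.41} $$

Proof. By the proof of Lemma 4.6, whp $Y^{(\mathrm{SWG})}(t_n - B_n) \le \underline m_n$. By (4.13), the probability that
there exists a miscoupling before the draw of the $\underline m_n$th half-edge is $o(1)$ when $\underline m_n = o(\sqrt n)$. ■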



Lemma 4.8 (No late miscouplings of high degree) If $\overline m_n \le \sqrt{n\log n}$, and $C_n$ satisfies

T'Edn{d.>c.} = oin), (4.42) 

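then

$$ \mathbb{P}(\mathcal D_n^c \cap \mathcal B_n \cap \mathcal C_n) = o(1). \tag{4.43} $$

Proof. On the event $\mathcal B_n$, $Y^{(\mathrm{SWG})}(t_n + B_n) \le \overline m_n$. An upper bound for the probability of
miscoupling a vertex of degree at least $C_n$ during the first $\overline m_n$ pairings is thus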

^ ( 4(l-o(l)) )''^'^'^^"> - ^^(l-o(l)) ^ '^'^'^^^^"^ = ^'-'^^ 



Proposition 4.9 (Miscoupled vertices have small offspring) If $\overline m_n \le \sqrt{n\log n}$ and

e^°'"^"Cnfnl/£n = o{^/n), then 

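$$ \mathbb{P}(\mathcal A_n^c \cap \mathcal B_n \cap \mathcal C_n \cap \mathcal D_n) = o(1). \tag{4.45} $$

Proof. We split the proof into three contributions, namely, a bound on $|\mathrm{SWG}(t)\setminus\mathrm{BP}_{(n)}(t)|$, a
bound on the contribution to $|\mathrm{BP}_{(n)}(t)\setminus\mathrm{SWG}(t)|$ due to cycle-creating events, and a bound on
the contribution to $|\mathrm{BP}_{(n)}(t)\setminus\mathrm{SWG}(t)|$ due to miscouplings. We start with the first bound: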
A bound on $|\mathrm{SWG}(t)\setminus\mathrm{BP}_{(n)}(t)|$. By construction, the number of miscoupled half-edges in
$\mathrm{SWG}(t)$ at any time $t$ is bounded from above by



MlS(i) 

i=i 



(4.46) 



where $\mathrm{MIS}(t)$ denotes the number of miscoupled vertices, $T_j$ is the birth of the $j$th miscoupled
vertex, and for the $j$th miscoupled vertex $V_j$, $Y_j^{(\mathrm{SWG})}(t)$ is the number of half-edges at flow
distance (total edge weight) at most $t$ from $V_j$. On the event $\mathcal C_n$, $T_1 \ge t_n - B_n$. Therefore, for
every $t \le t_n + B_n$, on the event $\mathcal C_n$,



MIS(t„+B„) 

|SWG(t) \ BP(„)(t)| < E Yj''^''\2Bn). 



(4.47) 



By the Markov inequality. 



We rewrite 



E 



{|SWG(t) \ BP(„)(t)| > eV^} n s„ n n 

M\S{t„+Bn) 

i=i 



M\S{tn+B„) 

= (1 + o(l)) E IP(« miscoupled, ^„ n C„ n P„)E[y<^^'^'(25n)] 

ie[n] 
dijnn \ 2, 



(4.48) 



(4.49) 



ie[n] 



]E[y(swG)(25„)]^ 



where we use that, upon miscoupling of vertex $i$, we redraw a vertex from the size-biased distribution, for which the number of half-edges found before time $2B_n$ is equal to $\mathbb{E}[Y^{(\mathrm{SWG})}(2B_n)](1 + o(1))$
since $\overline m_n \le \sqrt{n\log n}$ and $\mathcal B_n$ occurs. Since $\mathbb{E}[Y^{(\mathrm{SWG})}(t)] \le \mathbb{E}[Y^{(\mathrm{BP})}(t)]$, we obtain that

E[Yj'''^''\2Bn)] < ^„e2""^"(l + o(l)). (4.50) 

Therefore, we arrive at 

MIS(tn+B„) 



E 



(4.51) 



Combining (4.48)-(4.51) proves that |SWG(t) \ BP(„)(t)| = Or{^/n), since {e^/n)-'^ml/£n = o(l). 



Bounding the contribution to $|\mathrm{BP}_{(n)}(t)\setminus\mathrm{SWG}(t)|$ due to cycle-creating events. On the
event $\mathcal C_n$, $T_1 \ge t_n - B_n$. Recall that the $j$th miscoupled vertex is denoted by $V_j$, and that the
time of the occurrence of the $j$th miscoupled vertex is $T_j$. On the event $\mathcal D_n$, $d_{V_j} \le C_n$ for every $j$
for which $T_j \le t_n + B_n$. When a cycle-creating event occurs, the two half-edges that form the last
edge in the cycle are removed from $\mathrm{SWG}(t)$, but they are kept in $\mathrm{BP}(t)$. On the event $\mathcal B_n \cap \mathcal C_n$,
the expected number of cycle-creating events is bounded by

Furthermore, on the event ;B„ n C„, the expected offspring of the half-edges involved in cycle- 
creating events is at most 

0(^)E[y(2S„)], (4.53) 

where {Y(t))t>Q is the total number of individuals that have ever been alive in a CTBP where all 
individuals have i.i.d. offpring with law — 1, starting from — 1 individuals. Indeed, we have 
no information about the remaining lifetime of the half-edge involved in the cycle-creating event. 
As a result, we instantaneously pair it to an i.i.d. draw of a half-edge, and start the BP(„)(i) from 
there. The total number of individuals ever alive only increases by this change. On the event 
Vn, we have: E[y(2S„)] < C„A„e2""-f^"(l + o(l)). By assumption C„^„e2°"-f^"/4 = o{^/E). 
Therefore, the contribution to |BP(„)(t) \ SWG(t)| due to cycle-creating events is Op{^/n), as 
required. 



Bounding the contribution to $|\mathrm{BP}_{(n)}(t)\setminus\mathrm{SWG}(t)|$ due to miscouplings. We complete the
proof of Proposition 4.9 by dealing with the contribution to $|\mathrm{BP}_{(n)}(t)\setminus\mathrm{SWG}(t)|$ due to miscouplings. Now, for $\mathrm{BP}_{(n)}(t)$, we do not redraw the random variable $V_j$. We can give an upper
bound on the contribution to $\mathrm{BP}_{(n)}(t)$ of these miscouplings by instantaneously pairing the half-edges to an i.i.d. draw of a half-edge. As a result, the contribution to $|\mathrm{BP}_{(n)}(t)\setminus\mathrm{SWG}(t)|$ due to
miscouplings can be bounded above by



(4.54) 



j:Tj<t„+Bn 



We use that, on the event $\mathcal C_n$, $T_j \ge t_n - B_n$, and on the event $\mathcal B_n$, the expected number of
miscouplings is at most $O(\overline m_n^2/\ell_n)$. Finally, on the event $\mathcal D_n$, $d_{V_j} \le C_n$ for every $j$ for which
$T_j \le t_n + B_n$. Therefore,



E 



{tn + Bn- r,-)lB„nc„n7?„ < C„A„e2""^"(l + o(l)). 



We conclude that, on the event Bn n C„ n Vn, 



E 



J2 Yj^^\tn + Bn - fj)ls„nc,.nv„] < o{^)Cne 



(4.55) 



(4.56) 



j: Tj<t„+Bn 



By assumption the r.h.s. is $o(\sqrt n)$. Therefore, the contribution to $|\mathrm{BP}(t)\setminus\mathrm{SWG}(t)|$ due to miscouplings is $o_{\mathbb P}(\sqrt n)$, as required. This completes the proof of Proposition 4.9. ■



Proof of Proposition 2.4(b). It suffices to show that we can choose the sequences $B_n, C_n, \varepsilon_n, \underline m_n, \overline m_n$
such that all conditions in Lemmas 4.6-4.8 and Proposition 4.9 apply. It is readily verified that
we can take:

rn,„ = -y/n/(log log n)"''^, nin = Vn{logn)^^^, (4.57) 

and 

En = 1/logn. 



Bn = log log log n, Cn = n 



1/4 



(4.58) 
By Condition 1.1(c)



E 



ten 



logn 



o((logn)-i), (4.59) 



all the conditions in Lemmas 4.6-4.8 as well as Proposition 4.9 hold. Therefore, the claim in
(4.23) follows, which completes the proof of Proposition 2.4(b). ■



5 Height CLT and stable age: Proof of Proposition 2.5 

We first prove Proposition 2.5(a). By Proposition 2.4(a), at time $s_n$, whp, $(\mathrm{SWG}(s))_{s\le s_n}$ is
perfectly coupled to the two independent CTBPs $(\mathrm{BP}(s))_{s\le s_n}$. The proof contains several key
steps.



Reduction to a single BP. We start by showing that, in order for Proposition 2.5(a) to hold,
it suffices to prove that for $j \in \{1,2\}$, $x, t \in \mathbb{R}$ and $s \ge 0$,



e-""*1BP^i^^(,,[t„ +t,tn + t + s)\ e°*c^(x)F«(s) (5.1) 



where we use (2.28) in Proposition 2.4(a) to see that ^/W^^^JW^ G [e,l/e] whp. Indeed, by 
Proposition 2.4(b) and the fact that e~""*" = n"^/^, (5.1) immediately imphes that 

g-a„t„|swGW^ + t,in + t + s)\ = e-""*"|BP^]^ + t,in + t + s)\ + e-""*"Op(e„ V^) 



e"*$(x)F«(s) (5-2) 
Therefore, independence and (5.1) also proves that for j £ {1,2}, x,y,t G M and si,S2, 

e-'""*1SWG!fl^^(,)[tn + t,tn + t + si)||SWG^V5^)[t„ + t,tn + t + S2)\ (5.3) 
^e2"*$(x)f(y)F^(si)F«(52), 
which is the statement in Proposition 2.5(a). 



Using the branching property. To prove (5.1), we note that $(\mathrm{BP}^{(j)}(s))_{s\ge s_n}$ is the collection
of alive individuals in the different generations of a CTBP, starting from the alive particles in
$\mathrm{BP}^{(j)}(s_n)$. We condition on $(\mathrm{BP}^{(j)}(s))_{s\in[0,s_n]}$. Then

|BP<l„w[i~n + t,in+i + s)| = J2 E \BP^k''\tn + t-Sn-Ri,in + t + S-Sn-Ri)\, i^-^) 

ieBp(j)(s„) k=i 

where $G_i^{(j)}$ is the generation of $i \in \mathrm{BP}^{(j)}(s_n)$, while $R_i$ is its remaining lifetime, and $(\mathrm{BP}^{(i,j)}(t))_{t\ge 0}$
are i.i.d. CTBPs for different $i$, for which the offspring for each individual has distribution $D^*$.



Truncating the branching process. We continue by proving that we can truncate the branching process at the expense of an error term that converges to zero in probability. We let $\mathrm{BP}^{(i,j,\vec m)}$
denote the branching process $\mathrm{BP}^{(i,j)}$ obtained by truncating particles in generation $l$ (measured
from the root $i$) by $m_l = K_n\eta^{-l}$. We take $K_n \to \infty$ such that $K_n e^{-\alpha_n s_n} = o(1)$. We first show
that, as t ^ oo, we can replace e"""*"|BP^'^^^(^j[t„,t„ + by e"""*"|BPl!;f;^('^j[t„,t„ + s)|, at the
expense of a Op(l)-term. Indeed, with 



|BP(''^'(t)| =J2\BPf'\t)\, |BP(''^'™'(t)| = ^|BP^''^'"^(t)|, 



(5.5) 



k=l 



k=l 



we have, uniformly in $t \ge 0$ and $k \ge 0$, by the $n$-dependent version of Proposition 3.1(a) in
Proposition 3.1(d),



e-""*E 



BP^^^\t,t + s)\-\BP^r'[t,t + s 



< e"""*E 



Bp(»'^)(i)| - |BP(^'^'")(i)| = o(l). 



Therefore, using that the law of only depends on J-g^ through Ri,tn, 



-Otntr, 



Yl ^\\BPt''\tn + t-Sn-Ri)\-\BP'C'''^\tn+t-Sn-Ri)\\Ts., 



< e-°"*" Y lE[|BP<t(.)(*n + t-Sn- R^)\ " I BP^ctr,*., (*n + ^ " " ^^1 I 



ieBp(^)(s„) 

3(1) ^ gan(fn-t„+«-^n-R«) ^ 0^(1)6"°^" ^ , 

jGBP(j'(s„) jeBP(^)(s„) 



-aRi 



(5.6) 



since the random variable \tn — tn\ is tight, and assuming that s„ — )• cx) so slowly that s„|a„ — a| 
0(1). By [33, 35] and with ai = s — Ri, the birth-time of individual i, 



E 



-a„Ri 



(5.7) 



ieBpO)(s„) 



where we use the fact that {Ms^^)s>o is an n-independent martingale by the remark on [34, p. 
234, line 7]. We conclude that



|BP^^], M[tn+t,tn+t + s) 



(5.8) 



kt„{x)-GY 

Y Yl \BPk''^^[tn + t-Sn-Ri,tn + t + S-Sn-Ri)\+Oril). 

jeBp(J'(s„) k=i 



A conditional second moment method: first moment. We next use a conditional second
moment estimate on the sum on the right-hand side of (5.8), conditionally on $\mathcal F_{s_n}$. By the
$n$-dependent version of Proposition 3.1(c) in Proposition 3.1(d), uniformly in $n$ and for each
$i \in \mathrm{BP}^{(j)}(s_n)$ and $k_n = o(\sqrt{\log n})$,



^ — O-ntr 

As a result, when tn + t — Sn 



E 



A$(x)F^(s) 



Dp(i,i,»") 
°^<fet„ (^) 



Ri oo and G^^' = Op(\/Iogn), 



Ri 



(5.9) 



(5.10) 



-a„{t„+t- 



BP<(.j [tfi -\- t Sfi Ri, tji ~\~ t -\- S Sn -^i ) I I Ri ) 



A$(x)F«(s)[l + Op(l)]. 
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This yields that, when = Op(\/logn) (which happens whp when s„ is sufficiently small), 

e[ ^ \BPl''''^\tn+t-Sn-Ruin + t + S-Sn-Ri)\\FsJ^ (5.11) 

ieBp(J)(s„) k=l 

= ^e"*$(x)F«(s)[l + Op(l)] ^ e«n(f„-tn-.n-fi.) 



-a„t, 



jeBp(J'(s„) 



$ (x) (s) vW^VW(^, 



since e""(*"-*") = y Wii'Wl'"'' ^ VW^^Wf^^, whereas by (5.7), EigBpO){s„) ^ 
W*^' /A. 



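A conditional second moment method: second moment. We next bound, conditionally on $\mathcal{F}_{s_n}$, the variance of the sum on the right-hand side of (5.8). By conditional independence of $(\mathrm{BP}^{(i,j)})_{i \ge 1}$,

$$\begin{aligned}
e^{-2\alpha_n t_n}\, &\mathrm{Var}\Big(\sum_{i \in \mathrm{BP}^{(j)}(s_n)} \sum_{k=1}^{k_{t_n}(x) - G_i^{(j)}} |\mathrm{BP}^{(i,j,\vec m)}_{k}[\bar t_n + t - s_n - R_i, \bar t_n + t + s - s_n - R_i)| \,\Big|\, \mathcal{F}_{s_n}\Big) \\
&= e^{-2\alpha_n t_n} \sum_{i \in \mathrm{BP}^{(j)}(s_n)} \mathrm{Var}\Big(\sum_{k=1}^{k_{t_n}(x) - G_i^{(j)}} |\mathrm{BP}^{(i,j,\vec m)}_{k}[\bar t_n + t - s_n - R_i, \bar t_n + t + s - s_n - R_i)| \,\Big|\, \mathcal{F}_{s_n}\Big).
\end{aligned} \tag{5.12}$$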
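Since adding individuals and enlarging the time-frame does not reduce the variance,

$$\begin{aligned}
\mathrm{Var}\Big(\sum_{k=1}^{k_{t_n}(x) - G_i^{(j)}} &|\mathrm{BP}^{(i,j,\vec m)}_{k}[\bar t_n + t - s_n - R_i, \bar t_n + t + s - s_n - R_i)| \,\Big|\, \mathcal{F}_{s_n}\Big) \\
&\le \mathbb{E}\big[|\mathrm{BP}^{(i,j,\vec m)}(\bar t_n + t - s_n - R_i)|^2 \,\big|\, \mathcal{F}_{s_n}\big] = \mathbb{E}\big[|\mathrm{BP}^{(i,j,\vec m)}(\bar t_n + t - s_n - R_i)|^2 \,\big|\, R_i, \bar t_n\big].
\end{aligned} \tag{5.13}$$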



By the $n$-dependent version of Proposition 3.1(b) in Proposition 3.1(d), uniformly in $n$ and $t$,

$$e^{-2\alpha_n t}\, \mathbb{E}\big[|\mathrm{BP}^{(i,j,\vec m)}(t)|^2\big] \le C K_n. \tag{5.14}$$



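As a result,

$$\begin{aligned}
e^{-2\alpha_n t_n}\, &\mathrm{Var}\Big(\sum_{i \in \mathrm{BP}^{(j)}(s_n)} \sum_{k=1}^{k_{t_n}(x) - G_i^{(j)}} |\mathrm{BP}^{(i,j,\vec m)}_{k}[\bar t_n + t - s_n - R_i, \bar t_n + t + s - s_n - R_i)| \,\Big|\, \mathcal{F}_{s_n}\Big) \\
&\le e^{-2\alpha_n t_n} \sum_{i \in \mathrm{BP}^{(j)}(s_n)} \mathbb{E}\big[|\mathrm{BP}^{(i,j,\vec m)}(\bar t_n + t - s_n - R_i)|^2 \,\big|\, \mathcal{F}_{s_n}\big] \\
&\le C K_n e^{-2\alpha_n t_n} \sum_{i \in \mathrm{BP}^{(j)}(s_n)} e^{2\alpha_n(\bar t_n + t - s_n - R_i)} = O_P(1)\, K_n e^{-2\alpha_n s_n} \sum_{i \in \mathrm{BP}^{(j)}(s_n)} e^{-2\alpha_n R_i},
\end{aligned} \tag{5.15}$$

since $e^{2\alpha_n(\bar t_n - t_n)} = O_P(1)$. We can bound this further as

$$K_n e^{-2\alpha_n s_n} \sum_{i \in \mathrm{BP}^{(j)}(s_n)} e^{-2\alpha_n R_i} \le K_n e^{-\alpha_n s_n} \Big(e^{-\alpha_n s_n} \sum_{i \in \mathrm{BP}^{(j)}(s_n)} e^{-\alpha_n R_i}\Big) = O_P(1)\, K_n e^{-\alpha_n s_n} = o_P(1), \tag{5.16}$$

precisely when $K_n e^{-\alpha_n s_n} = o(1)$. By (5.11) and (5.15), the sum on the right-hand side of (5.8) is, conditionally on $\mathcal{F}_{s_n}$, concentrated around its asymptotic conditional mean given in (5.11). As a result, (5.1) follows. This completes the proof of Proposition 2.5(a). ■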
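
The concentration claim above is a conditional Chebyshev argument. The display below is only a sketch of that step: the shorthand $S_n$ for $e^{-\alpha_n t_n}$ times the double sum on the right-hand side of (5.8) is introduced here for illustration and is not notation from the paper.

```latex
% S_n is a hypothetical shorthand: e^{-\alpha_n t_n} times the double sum in (5.8).
% The variance bound is the one obtained in (5.15)-(5.16).
\[
\mathbb{P}\Big( \big| S_n - \mathbb{E}[S_n \mid \mathcal{F}_{s_n}] \big| > \varepsilon
   \,\Big|\, \mathcal{F}_{s_n} \Big)
 \le \frac{\operatorname{Var}(S_n \mid \mathcal{F}_{s_n})}{\varepsilon^{2}}
 \le \frac{O_P(1)\, K_n e^{-\alpha_n s_n}}{\varepsilon^{2}}
 \xrightarrow{P} 0
 \qquad \text{for every fixed } \varepsilon > 0 .
\]
```

Since the conditional probability is bounded by one, convergence to zero in probability also gives convergence of the unconditional probability, so $S_n$ and its conditional mean in (5.11) share the same limit in probability.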

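In order to prove Proposition 2.5(b), we need to investigate the asymptotics of the sum $\sum_{i=1}^{m} X_i^*$, where $m = |\mathrm{SWG}^{(j)}_{\le k_{t_n}(x)}[\bar t_n + t, \bar t_n + t + s_2)| \xrightarrow{P} \infty$ on the event that $W^{(1)}W^{(2)} > 0$, and $(X_i^*)_{i \ge 1}$ are $X_i^* = d_{V_i} - 1$ with $(V_i)_{i \ge 1}$ the size-biased reordering of $(d_i)_{i \in [n] \setminus S_m}$, where $S_m$ is the set of vertices found in $\mathrm{SWG}(\bar t_n + t)$. We will prove that, conditionally on $\mathcal{F}_{\bar t_n + t}$,

$$\frac{1}{m \nu_n} \sum_{i=1}^{m} X_i^* \xrightarrow{P} 1, \tag{5.17}$$

and then the proof of Proposition 2.5(b) follows from the proof of Proposition 2.5(a) and the fact that $\nu_n \to \nu$.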

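By the proof of Proposition 4.9, $|S_m| \le m_n$ whp, where $m_n$, which is $\sqrt{n}$ times a power of $\log n$, is as in (4.57). As a result the sequence $(d_i)_{i \in [n] \setminus S_m}$ satisfies Condition 1.1 when $(d_i)_{i \in [n]}$ does. Hence Lemma 4.5 holds with $B_i$ replaced by $X_i^*$, so that in particular, from the Markov inequality, conditionally on $\mathcal{F}_{\bar t_n + t}$,

$$\frac{1}{m} \sum_{i=1}^{m} X_i^* \mathbb{1}_{\{X_i^* > K_n\}} \xrightarrow{P} 0. \tag{5.18}$$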

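We use a conditional second moment method on $\sum_{i=1}^{m} X_i^* \mathbb{1}_{\{X_i^* \le K_n\}}$, conditionally on $\mathcal{F}_{\bar t_n + t}$. By (4.15) in Lemma 4.5,

$$\mathbb{E}\Big[\sum_{i=1}^{m} X_i^* \mathbb{1}_{\{X_i^* \le K_n\}} \,\Big|\, \mathcal{F}_{\bar t_n + t}\Big] = m \nu_n (1 + o_P(1)). \tag{5.19}$$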



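This gives the asymptotics of the first conditional moment of $\sum_{i=1}^{m} X_i^* \mathbb{1}_{\{X_i^* \le K_n\}}$. For the second moment, we start by bounding the covariances. We note that, for $1 \le i < j \le m$,

$$\begin{aligned}
&\mathrm{Cov}\big(X_i^* \mathbb{1}_{\{X_i^* \le K_n\}}, X_j^* \mathbb{1}_{\{X_j^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}\big) \\
&\quad = \mathbb{E}\Big[X_i^* \mathbb{1}_{\{X_i^* \le K_n\}} \Big(\mathbb{E}\big[X_j^* \mathbb{1}_{\{X_j^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}, X_1^*, \ldots, X_i^*\big] - \mathbb{E}\big[X_j^* \mathbb{1}_{\{X_j^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}\big]\Big) \,\Big|\, \mathcal{F}_{\bar t_n + t}\Big].
\end{aligned} \tag{5.20}$$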

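By (4.15) in Lemma 4.5, as well as the fact that $i \le m_n = o(n)$,

$$\mathbb{E}\big[X_j^* \mathbb{1}_{\{X_j^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}, X_1^*, \ldots, X_i^*\big] - \mathbb{E}\big[X_j^* \mathbb{1}_{\{X_j^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}\big] = o_P(1), \tag{5.21}$$

so that also

$$\mathrm{Cov}\big(X_i^* \mathbb{1}_{\{X_i^* \le K_n\}}, X_j^* \mathbb{1}_{\{X_j^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}\big) = o_P(1). \tag{5.22}$$

Further, a trivial bound on the second moment together with (4.15) in Lemma 4.5 yields that

$$\mathrm{Var}\big(X_i^* \mathbb{1}_{\{X_i^* \le K_n\}} \,\big|\, \mathcal{F}_{\bar t_n + t}\big) \le K_n \mathbb{E}\big[X_i^* \,\big|\, \mathcal{F}_{\bar t_n + t}\big] = K_n \nu_n (1 + o_P(1)). \tag{5.23}$$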

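As a result, whenever $K_n m = o(m^2)$,

$$\mathrm{Var}\Big(\sum_{i=1}^{m} X_i^* \mathbb{1}_{\{X_i^* \le K_n\}} \,\Big|\, \mathcal{F}_{\bar t_n + t}\Big) = o_P(m^2), \tag{5.24}$$

which together with (5.19) proves that, conditionally on $\mathcal{F}_{\bar t_n + t}$,

$$\frac{1}{m \nu_n} \sum_{i=1}^{m} X_i^* \mathbb{1}_{\{X_i^* \le K_n\}} \xrightarrow{P} 1.$$

Together with (5.18), this proves (5.17), as required. ■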
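
The step from (5.22) and (5.23) to (5.24) can be written out as follows. This is a sketch under the assumption that the $o_P(1)$ bounds in (5.22) and (5.23) hold uniformly in the indices; the shorthand $Y_i$ is introduced here for illustration only, and all variances and covariances are conditional on $\mathcal{F}_{\bar t_n + t}$.

```latex
% Y_i is a hypothetical shorthand: Y_i = X_i^* 1{X_i^* <= K_n}.
% Assumption: the o_P(1) terms from (5.22)-(5.23) are uniform in i and j.
\[
\operatorname{Var}\Big( \sum_{i=1}^{m} Y_i \Big)
 = \sum_{i=1}^{m} \operatorname{Var}(Y_i)
   + 2 \sum_{1 \le i < j \le m} \operatorname{Cov}(Y_i, Y_j)
 \le m K_n \nu_n \big(1 + o_P(1)\big) + m^{2}\, o_P(1)
 = o_P(m^{2}),
\]
% The last equality uses K_n m = o(m^2) and the boundedness of \nu_n.
\[
\mathbb{P}\Big( \Big| \sum_{i=1}^{m} Y_i - \mathbb{E}\Big[\sum_{i=1}^{m} Y_i\Big] \Big|
   > \varepsilon m \nu_n \Big)
 \le \frac{o_P(m^{2})}{\varepsilon^{2} m^{2} \nu_n^{2}}
 \xrightarrow{P} 0 .
\]
```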
6 Extensions to other random graphs



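Proof of Theorem 1.6. Let $\mathrm{UG}_n(d)$ be a uniform random graph with degree sequence $d$. By [12] (see also [13]), we have that the law of $\mathrm{UG}_n(d)$ is the same as that of $\mathrm{CM}_n(d)$ conditioned on being simple, i.e., for every sequence of events $\mathcal{H}_n$ defined on graphs with vertex set $[n]$,

$$\mathbb{P}(\mathrm{UG}_n(d) \in \mathcal{H}_n) = \mathbb{P}(\mathrm{CM}_n(d) \in \mathcal{H}_n \mid \mathrm{CM}_n(d) \text{ simple}) = \frac{\mathbb{P}(\mathrm{CM}_n(d) \in \mathcal{H}_n, \mathrm{CM}_n(d) \text{ simple})}{\mathbb{P}(\mathrm{CM}_n(d) \text{ simple})}. \tag{6.1}$$

By (1.24), it suffices to investigate $\mathbb{P}(\mathrm{CM}_n(d) \in \mathcal{H}_n, \mathrm{CM}_n(d) \text{ simple})$. We take

$$\mathcal{H}_n = \Big\{ \frac{H_n - \gamma_n \log n}{\sqrt{\beta \log n}} \le x, \ L_n - \frac{1}{\alpha_n} \log n \le y \Big\}, \tag{6.2}$$

where $H_n$ and $L_n$ are the hopcount and weight of the optimal path between two uniformly selected vertices conditioned on being connected.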

By the results in Section 2, and with $\tilde t_n = t_n + B_n$, where $B_n = \log\log\log n$ is defined in (4.58), whp, we have found the minimal weight path before time $\tilde t_n$. The probability that we have found a self-loop or multiple edge at time $\tilde t_n$ is negligible, since, by that time we have found of order $m_n$ vertices and paired of order $m_n$ edges (with $m_n$ equal to $\sqrt{n}$ times a power of $\log n$), see Lemma 4.6. Let $d_i(\tilde t_n)$ denote the number of unpaired half-edges incident to vertex $i$ at time $\tilde t_n$. Since $\mathrm{CM}_n(d)$ is created by matching the half-edges uniformly at random, in order to create $\mathrm{CM}_n(d)$ after time $\tilde t_n$, we need to match the half-edges corresponding to $(d_i(\tilde t_n))_{i \in [n]}$. This corresponds to the configuration model on $[n]$ with degrees $(d_i(\tilde t_n))_{i \in [n]}$. Since we have found of order $m_n$ vertices and paired of order $m_n$ edges at time $\tilde t_n$, when $d$ satisfies Condition 1.1, then so does $(d_i(\tilde t_n))_{i \in [n]}$ with the same limiting degree distribution $D$. As a result, the probability that the configuration model on $[n]$ with degrees $(d_i(\tilde t_n))_{i \in [n]}$ is simple is asymptotically equal to $e^{-\nu/2 - \nu^2/4}(1 + o(1))$, and we obtain that the event that $\mathrm{CM}_n(d)$ is simple is asymptotically independent of the event $\mathcal{H}_n$ in (6.2). Therefore, Theorem 1.6 follows from Theorems 1.2-1.3. ■
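
The uniform half-edge matching used in this proof, and the limiting probability of simplicity $e^{-\nu/2 - \nu^2/4}$, can be explored numerically. The following is a minimal sketch, not the construction used in the paper: the function names are hypothetical, and $\nu$ is computed from the empirical degrees as $\sum_i d_i(d_i - 1) / \sum_i d_i$.

```python
import math
import random
from collections import Counter


def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random and return the edge list."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the total degree must be even")
    # One entry per half-edge, labelled by the vertex it is attached to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    # Pairing consecutive entries of a uniform shuffle is a uniform matching.
    return [(half_edges[k], half_edges[k + 1])
            for k in range(0, len(half_edges), 2)]


def is_simple(edges):
    """True when the multigraph has no self-loops and no multiple edges."""
    if any(u == v for u, v in edges):
        return False
    counts = Counter((min(u, v), max(u, v)) for u, v in edges)
    return all(c == 1 for c in counts.values())


def simple_probability_estimate(degrees, trials, seed=0):
    """Monte Carlo estimate of the probability that the multigraph is simple."""
    rng = random.Random(seed)
    hits = sum(is_simple(configuration_model(degrees, rng))
               for _ in range(trials))
    return hits / trials


def limiting_simple_probability(degrees):
    """exp(-nu/2 - nu^2/4) with nu taken from the empirical degree sequence."""
    nu = sum(d * (d - 1) for d in degrees) / sum(degrees)
    return math.exp(-nu / 2 - nu * nu / 4)
```

For a given degree sequence, comparing `simple_probability_estimate(degrees, trials)` with `limiting_simple_probability(degrees)` gives a rough check of the asymptotic formula; agreement is only expected for large $n$.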



Proof of Theorem 1.7. By Janson [38], when $W_n \xrightarrow{d} W$ and $\mathbb{E}[W_n^2] \to \mathbb{E}[W^2]$, the inhomogeneous random graphs with edge probabilities in (1.27), (1.29) or (1.30) are asymptotically equivalent, so it suffices to prove the claim for the generalized random graph for which $p_{ij} = w_i w_j / (\ell_n + w_i w_j)$. As explained in Section 1.4, conditionally on the degrees in the generalized random graph being equal to $d$, the distribution of the resulting random graph is uniform over all random graphs with these degrees. Therefore, Theorem 1.7 follows from Theorem 1.6 if we prove that Condition 1.1 follows from the statements that $W_n \xrightarrow{d} W$, $\mathbb{E}[W_n] \to \mathbb{E}[W]$, $\mathbb{E}[W_n^2] \to \mathbb{E}[W^2]$ and $\lim_n \mathbb{E}[W_n^2 \log(W_n/K_n)_+] = 0$. We denote by $d = (d_i)_{i \in [n]}$ the degree sequence in the generalized random graph, and note that $d$ now is a random sequence. We work conditionally on $d$, and let $\mathbb{P}_n, \mathbb{E}_n$ denote the conditional probability and expectation given $d$. Then, we prove that $\mathbb{P}_n(D_n = k) \xrightarrow{P} \mathbb{P}(D = k)$, $\mathbb{E}_n[D_n] \xrightarrow{P} \mathbb{E}[D]$, $\mathbb{E}_n[D_n^2] \xrightarrow{P} \mathbb{E}[D^2]$ and $\mathbb{E}_n[D_n^2 \log(D_n/K_n)_+] \xrightarrow{P} 0$ for every $K_n \to \infty$.

We let $D_n = d_V$, where $V \in [n]$ is a uniformly chosen vertex. First, by (1.31), $D_n \xrightarrow{d} D$, where $D$ is a Poisson random variable with random intensity $W$. Further,

$$\mathbb{E}_n[D_n] = \frac{1}{n} \sum_{i \in [n]} d_i, \qquad \mathbb{E}_n[D_n^2] = \frac{1}{n} \sum_{i \in [n]} d_i^2, \tag{6.3}$$

where $d_i = \sum_{j \in [n] : j \ne i} I_{ij}$ with $I_{ij}$ independent Bernoulli variables with parameter $p_{ij} = w_i w_j / (\ell_n + w_i w_j)$. It is tedious, but not difficult, to show that the above sums are concentrated around their means, for example by computing their means and variances and showing that the variances are of smaller order than their means squared. We omit the details.
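
The empirical moments in (6.3) are easy to simulate, which gives an informal check on the concentration claim. The sketch below samples the degree sequence of a generalized random graph with $p_{ij} = w_i w_j / (\ell_n + w_i w_j)$ and $\ell_n = \sum_i w_i$; the function names are hypothetical, and the quadratic-time double loop is chosen for clarity rather than speed.

```python
import random


def grg_edge_probability(w_i, w_j, total_weight):
    """Edge probability p_ij = w_i w_j / (l_n + w_i w_j)."""
    return w_i * w_j / (total_weight + w_i * w_j)


def sample_grg_degrees(weights, rng):
    """Sample the degrees d_i = sum over j != i of independent Bernoulli(p_ij)."""
    n = len(weights)
    total = sum(weights)  # l_n, the total weight
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            p = grg_edge_probability(weights[i], weights[j], total)
            if rng.random() < p:
                # Each present edge contributes to both endpoint degrees.
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def empirical_moments(degrees):
    """Return (E_n[D_n], E_n[D_n^2]) as in (6.3)."""
    n = len(degrees)
    return sum(degrees) / n, sum(d * d for d in degrees) / n
```

With constant weights $w_i = w$, the first empirical moment should be close to $(n-1)\,w^2/(nw + w^2)$, which is approximately $w$ for large $n$.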

In order to show that $\mathbb{E}_n[D_n^2 \log(D_n/K_n)_+] = o_P(1)$, we note that

$$\mathbb{E}_n[D_n^2 \log(D_n/K_n)_+] = \frac{1}{n} \sum_{i \in [n]} d_i^2 \log(d_i/K_n)_+. \tag{6.4}$$

As before, $d_i = \sum_{j \in [n] : j \ne i} I_{ij}$ with $I_{ij}$ independent Bernoulli variables with parameter $p_{ij} = w_i w_j / (\ell_n + w_i w_j)$. By standard Chernoff bounds, there exists a constant $a > 0$ such that, for every $A > 2$,

$$\mathbb{P}(d_i > A\, \mathbb{E}[d_i]) \le e^{-a A \mathbb{E}[d_i]}. \tag{6.5}$$

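Here,

$$\mathbb{E}[d_i] = \sum_{j \in [n] : j \ne i} \frac{w_i w_j}{\ell_n + w_i w_j} = w_i (1 + o(1)), \tag{6.6}$$

since $\max_i w_i = o(\sqrt{n})$. As a result,

$$\mathbb{E}_n[D_n^2 \log(D_n/K_n)_+] \le \frac{4}{n} \sum_{i \in [n]} w_i^2 \log(2 w_i/K_n)_+ + \frac{1}{n} \sum_{i \in [n]} \mathbb{1}_{\{d_i > 2 w_i\}} d_i^2 \log(d_i/K_n)_+. \tag{6.7}$$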

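The first term vanishes by the fact that $\lim_{n \to \infty} \mathbb{E}[W_n^2 \log(W_n/K_n)_+] = 0$. The second term can be split as

$$\begin{aligned}
\frac{1}{n} \sum_{i \in [n]} \mathbb{1}_{\{d_i > 2 w_i\}} d_i^2 \log(d_i/K_n)_+ &\le \frac{1}{n} \sum_{k=1}^{\infty} \sum_{i \in [n]} \mathbb{1}_{\{d_i \in [2^k w_i, 2^{k+1} w_i)\}} d_i^2 \log(d_i/K_n)_+ \\
&\le \frac{1}{n} \sum_{i \in [n]} \sum_{k=1}^{\infty} \mathbb{1}_{\{d_i \ge 2^k w_i\}} 2^{2(k+1)} w_i^2 \log(2^{k+1} w_i/K_n)_+.
\end{aligned} \tag{6.8}$$

By (6.5)-(6.6) with $A = 2^k$, the mean of the above random variable vanishes, which, by Markov's inequality, implies that it converges to zero in probability. ■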